arXiv: 1505.05175v2 [cs.IT] 14 Feb 2017 


TENSOR THETA NORMS AND LOW RANK RECOVERY 


HOLGER RAUHUT AND ZELJKA STOJANAC 


Abstract. We study extensions of compressive sensing and low rank matrix recovery
to the recovery of tensors of low rank from incomplete linear information.
While the reconstruction of low rank matrices via nuclear norm minimization
is rather well understood by now, almost no theory is available so far for the
extension to higher order tensors due to various theoretical and computational
difficulties arising for tensor decompositions. In fact, nuclear norm minimization
for matrix recovery is a tractable convex relaxation approach, but the
extension of the nuclear norm to tensors is in general NP-hard to compute.
In this article, we introduce convex relaxations of the tensor nuclear norm
which are computable in polynomial time via semidefinite programming. Our
approach is based on theta bodies, a concept from computational algebraic
geometry which is similar to the better known Lasserre relaxations.
We introduce polynomial ideals which are generated by the second order minors
corresponding to different matricizations of the tensor (where the tensor entries
are treated as variables) such that the nuclear norm ball is the convex hull of
the algebraic variety of the ideal. The theta body of order $k$ for such an ideal
generates a new norm which we call the $\theta_k$-norm. We show that in the matrix
case, these norms reduce to the standard nuclear norm. For tensors of order
three or higher, however, we indeed obtain new norms. The sequence of the
corresponding unit $\theta_k$-norm balls converges asymptotically to the unit tensor
nuclear norm ball. By providing the Gröbner basis for the ideals, we explicitly
give semidefinite programs for the computation of the $\theta_k$-norm and for the
minimization of the $\theta_k$-norm under an affine constraint. Finally, numerical
experiments for order-three tensor recovery via $\theta_1$-norm minimization suggest
that our approach successfully reconstructs tensors of low rank from incomplete
linear (random) measurements.

Keywords: low rank tensor recovery, tensor nuclear norm, theta bodies, compressive
sensing, semidefinite programming, convex relaxation, polynomial ideals, Gröbner
bases

1. Introduction and Motivation

Compressive sensing predicts that sparse vectors can be recovered from
underdetermined linear measurements via efficient methods such as $\ell_1$-minimization.
This finding has various applications in signal and image processing
and beyond. It has recently been observed that the principles of this theory can
be transferred to the problem of recovering a low rank matrix from underdetermined
linear measurements. One prominent choice of recovery method consists in
minimizing the nuclear norm subject to the given linear constraint. This
convex optimization problem can be solved efficiently and recovery results for certain
random measurement maps have been provided, which quantify the minimal number
of measurements required for successful recovery.

There is significant interest in going one step further and extending the theory to
the recovery of low rank tensors (higher-dimensional arrays) from incomplete linear
measurements. Applications include image and video inpainting [45], reflectance data
recovery [45] (e.g., for use in photo-realistic raytracers), machine learning [55], and
seismic data processing [40]. Several approaches have already been introduced [24],
but unfortunately, so far, a completely satisfactory theory is available for none of
them. Either the method is not tractable, or no (complete)
rigorous recovery results quantifying the minimal number of measurements are
available, or the available bounds are highly nonoptimal.
For instance, the computation (and therefore also the minimization) of
the tensor nuclear norm for higher order tensors is in general NP-hard
[23]; nevertheless, some recovery results for tensor completion via nuclear norm
minimization are available in [62]. Moreover, versions of iterative hard thresholding
for various tensor formats have been introduced. This approach leads to a
computationally tractable algorithm, which empirically works well. However, only a
partial analysis based on the tensor restricted isometry property has been provided,
which so far only shows convergence under a condition on the iterates that cannot be
checked a priori. Nevertheless, the tensor restricted isometry property (TRIP) has
been analyzed for certain random measurement maps [51–53]. These near optimal
bounds on the number of measurements ensuring the TRIP, however, provide only
a hint on how many measurements are required because the link between the TRIP
and recovery is so far only partial.

This article introduces a new approach for tensor recovery based on convex
relaxation. The idea is to further relax the nuclear norm in order to arrive at a norm
which can be computed (and minimized under a linear constraint) in polynomial
time. The hope is that the new norm is only a slight relaxation and possesses
properties very similar to those of the nuclear norm. Our approach is based on theta
bodies, a concept from computational algebraic geometry which is similar to
the better known Lasserre relaxations. We arrive at a whole family of convex
bodies (indexed by a polynomial degree), which form convex relaxations of the unit
nuclear norm ball. The resulting norms are called theta norms. The corresponding
unit norm balls are nested and contain the unit nuclear norm ball. Even more,
the sequence of the unit $\theta_k$-norm balls converges asymptotically to the unit tensor
nuclear norm ball. The theta norms can be computed by semidefinite optimization, and
the minimization of the $\theta_k$-norm subject to a linear constraint is also a semidefinite
program (SDP) whose solution can be computed in polynomial time, with the complexity
growing with $k$.

The basic idea for the construction of these new norms is to define polynomial 
ideals, where each variable corresponds to an entry of the tensor, such that its 
algebraic variety consists of the rank-one tensors of unit Frobenius norm. The 
convex hull of this set is the tensor nuclear norm ball. The ideals that we propose 
are generated by the minors of order two of all matricizations of the tensor (or 
at least of a subset of the possible matricizations) together with the polynomial 
corresponding to the squared Frobenius norm minus one. Here, a matricization 
denotes a matrix which is generated from the tensor by combining several indices to 
a row index, and the remaining indices to a column index. In fact, all such minors 
being zero simultaneously means that the tensor has rank one. The $k$-theta body
of the ideal then corresponds to a relaxation of the convex hull of its algebraic
variety, i.e., to a further relaxation of the tensor nuclear norm. The index $k \in \mathbb{N}$
corresponds to a polynomial degree involved in the construction of the theta bodies
(some polynomial is required to be $k$-sos modulo the ideal, see below), and $k = 1$
leads to the largest theta body in a family of convex relaxations.

We will show that for the matrix case (tensors of order 2), our approach does not
lead to new norms. Rather, all resulting theta norms are equal to the matrix nuclear
norm. This fact suggests that the theta norms in the higher order tensor case are
all natural generalizations of the matrix nuclear norm.

We derive the corresponding semidefinite programs explicitly and present numerical
experiments which show that $\theta_1$-norm minimization successfully recovers tensors
of low rank from few random linear measurements. Unfortunately, a rigorous
theoretical analysis of the recovery performance of $\theta_k$-minimization is not yet
available but will be the subject of future studies.

1.1. Low rank matrix recovery. Before passing to tensor recovery, we recall
some basics on matrix recovery. Let $\mathbf{X} \in \mathbb{R}^{n_1 \times n_2}$ be of rank at most
$r \ll \min\{n_1, n_2\}$, and suppose we are given linear measurements

$$\mathbf{y} = \mathcal{A}(\mathbf{X}),$$

where $\mathcal{A} : \mathbb{R}^{n_1 \times n_2} \to \mathbb{R}^m$ is a linear map with $m \ll n_1 n_2$. Reconstructing
$\mathbf{X}$ from $\mathbf{y}$ amounts to solving an underdetermined linear system. Unfortunately,
the rank minimization problem of computing the minimizer of

$$\min_{\mathbf{Z} \in \mathbb{R}^{n_1 \times n_2}} \operatorname{rank}(\mathbf{Z}) \quad \text{subject to } \mathcal{A}(\mathbf{Z}) = \mathbf{y}$$

is NP-hard in general. As a tractable alternative, the convex optimization problem

$$\min_{\mathbf{Z} \in \mathbb{R}^{n_1 \times n_2}} \|\mathbf{Z}\|_* \quad \text{subject to } \mathcal{A}(\mathbf{Z}) = \mathbf{y} \tag{1}$$

has been suggested, where the nuclear norm $\|\mathbf{Z}\|_* = \sum_j \sigma_j(\mathbf{Z})$ is the sum of
the singular values of $\mathbf{Z}$. This problem can be solved efficiently by various methods
[3]. For instance, it can be reformulated as a semidefinite program, but splitting
methods may be more efficient.
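
As a small illustration of the objective in (1), the following sketch computes the nuclear norm as the sum of singular values. The helper name `nuclear_norm` and the test matrix are assumptions made for this example only; NumPy is assumed to be available.

```python
import numpy as np

def nuclear_norm(Z):
    """Sum of the singular values of the matrix Z."""
    return np.linalg.svd(Z, compute_uv=False).sum()

rng = np.random.default_rng(0)
# Rank-2 matrix of size 6 x 5, built as a product of two random factors.
Z = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))
print(np.linalg.matrix_rank(Z), nuclear_norm(Z))
```

The value agrees with NumPy's built-in `np.linalg.norm(Z, 'nuc')`, which can serve as a cross-check.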

While it is hard to analyze low rank matrix recovery for deterministic measurement
maps, optimal bounds are available for several random matrix constructions. If $\mathcal{A}$
is a Gaussian measurement map, i.e.,

$$\mathcal{A}(\mathbf{X})_j = \sum_{k,\ell} A_{jk\ell} X_{k\ell}, \quad j \in [m] := \{1, 2, \ldots, m\},$$

where the $A_{jk\ell}$, $j \in [m]$, $k \in [n_1]$, $\ell \in [n_2]$, are independent mean-zero,
variance-one Gaussian random variables, then a matrix $\mathbf{X}$ of rank at most $r$ can be
reconstructed exactly from $\mathbf{y} = \mathcal{A}(\mathbf{X})$ via nuclear norm minimization (1) with
probability at least $1 - e^{-cm}$ provided that

$$m \geq C r n, \quad n = \max\{n_1, n_2\}, \tag{2}$$

where the constants $c, C > 0$ are universal. Moreover, the reconstruction is
stable under passing to only approximately low rank matrices and under adding
noise on the measurements. Another interesting measurement map corresponds to
the matrix completion problem, where the measurements are randomly
chosen entries of the matrix $\mathbf{X}$. Measurements taken as Frobenius inner products
with rank-one matrices are studied in [42] and arise in the phase retrieval problem
as a special case [5]. Also here, $m \geq C r n$ (or $m \geq C r n \log(n)$ for certain structured
measurements) is sufficient for exact recovery.
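
The Gaussian measurement map can be sketched in a few lines. The function names `gaussian_measurement_map` and `apply_map`, and the chosen sizes, are hypothetical and serve only to illustrate the formula for $\mathcal{A}(\mathbf{X})_j$.

```python
import numpy as np

def gaussian_measurement_map(m, n1, n2, rng):
    """Return A as an (m, n1, n2) array of i.i.d. N(0, 1) entries."""
    return rng.standard_normal((m, n1, n2))

def apply_map(A, X):
    """Compute y_j = sum_{k,l} A[j, k, l] * X[k, l]."""
    return np.einsum("jkl,kl->j", A, X)

rng = np.random.default_rng(1)
n1, n2, r, m = 8, 7, 1, 30            # m = 30 is smaller than n1 * n2 = 56
X = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
A = gaussian_measurement_map(m, n1, n2, rng)
y = apply_map(A, X)
print(y.shape)
```

Equivalently, `A.reshape(m, -1) @ X.ravel()` gives the same vector, which makes explicit that the system is underdetermined when $m < n_1 n_2$.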

1.2. Tensor recovery. An order-$d$ tensor (or mode-$d$ tensor) is an element
$\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ indexed by $[n_1] \times [n_2] \times \cdots \times [n_d]$. Of course, the
case $d = 2$ corresponds to matrices. For $d \geq 3$, several notions and computational
tasks become much more involved than for the matrix case. Already the notion of
rank requires some clarification, and in fact, several different definitions are
available. We will mainly work with the canonical rank or CP-rank in the
following. A $d$th-order tensor $\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ is of rank one if there exist
vectors $\mathbf{u}^1 \in \mathbb{R}^{n_1}, \mathbf{u}^2 \in \mathbb{R}^{n_2}, \ldots, \mathbf{u}^d \in \mathbb{R}^{n_d}$ such that
$\mathbf{X} = \mathbf{u}^1 \otimes \mathbf{u}^2 \otimes \cdots \otimes \mathbf{u}^d$, or elementwise

$$X_{i_1 i_2 \ldots i_d} = u^1_{i_1} u^2_{i_2} \cdots u^d_{i_d}.$$

The CP-rank (or canonical rank, and in the following just rank) of a tensor
$\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$, similarly as in the matrix case, is the smallest number of
rank-one tensors that sum up to $\mathbf{X}$.
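
The elementwise definition of a rank-one tensor translates directly into an outer product. The sketch below, with hypothetical helper names `rank_one` and `cp_tensor`, builds a rank-one tensor and a sum of such tensors; a sum of $r$ rank-one terms has CP-rank at most $r$.

```python
import numpy as np
from functools import reduce

def rank_one(vectors):
    """Outer product u^1 (x) u^2 (x) ... (x) u^d of a list of vectors."""
    return reduce(np.multiply.outer, vectors)

def cp_tensor(coeffs, factors):
    """Sum over k of c_k * u^{1,k} (x) ... (x) u^{d,k}; factors[k] lists d vectors."""
    return sum(c * rank_one(us) for c, us in zip(coeffs, factors))

u = np.array([1.0, 2.0])
v = np.array([1.0, 0.0, -1.0])
w = np.array([3.0, 1.0])
X = rank_one([u, v, w])               # X[i, j, k] = u[i] * v[j] * w[k]
Y = cp_tensor([1.0, 0.5], [[u, v, w], [w, v, u]])
print(X.shape, X[1, 2, 0])
```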

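Given a linear measurement map $\mathcal{A} : \mathbb{R}^{n_1 \times \cdots \times n_d} \to \mathbb{R}^m$ (which can be represented
as a $(d+1)$th-order tensor), our aim is to recover a tensor $\mathbf{X} \in \mathbb{R}^{n_1 \times \cdots \times n_d}$ from
$\mathbf{y} = \mathcal{A}(\mathbf{X})$ when $m \ll n_1 n_2 \cdots n_d$. The matrix case $d = 2$ suggests considering
minimization of the tensor nuclear norm for this task,

$$\min_{\mathbf{Z}} \|\mathbf{Z}\|_* \quad \text{subject to } \mathcal{A}(\mathbf{Z}) = \mathbf{y},$$

where the nuclear norm is defined as

$$\|\mathbf{X}\|_* = \min\Big\{ \sum_{k=1}^{r} |c_k| : \mathbf{X} = \sum_{k=1}^{r} c_k\, \mathbf{u}^{1,k} \otimes \mathbf{u}^{2,k} \otimes \cdots \otimes \mathbf{u}^{d,k},\ r \in \mathbb{N},\ \|\mathbf{u}^{i,k}\|_{\ell_2} = 1,\ i \in [d],\ k \in [r] \Big\}.$$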

Unfortunately, in the tensor case, computing the canonical rank of a tensor, as well
as computing the nuclear norm of a tensor, is NP-hard in general.
Let us nevertheless mention that some theoretical results for tensor recovery via
nuclear norm minimization are contained in [62].

We remark that, unlike in the matrix scenario, the tensor rank and consequently
the tensor nuclear norm depend on the choice of base field.
In other words, the rank (and the nuclear norm) of a given tensor with
real entries depends on whether we regard it as a real tensor or as a complex tensor.
In this paper, we focus only on tensors with real-valued entries, i.e., we work over
the field $\mathbb{R}$.

The aim of this article is to introduce relaxations of the tensor nuclear norm, based
on theta bodies, which are computationally tractable and whose minimization
allows for exact recovery of low rank tensors from incomplete linear measurements.

Let us remark that one may reorganize (flatten) a low rank tensor $\mathbf{X} \in \mathbb{R}^{n \times n \times n}$
into a low rank matrix $\tilde{\mathbf{X}} \in \mathbb{R}^{n \times n^2}$ and simply apply concepts from matrix recovery.
However, the bound (2) on the required number of measurements then reads

$$m \geq C r n^2. \tag{3}$$

Moreover, it has been suggested to minimize the sum of nuclear norms
of the unfoldings (different reorganizations of the tensor as a matrix) subject to
the linear constraint matching the measurements. Although this seems to be a
reasonable approach at first sight, it has been shown in [35] that it cannot work
with fewer measurements than stated by the estimate in (3). This is essentially due
to the fact that the tensor structure is not represented. That is, instead of solving a
tensor nuclear norm minimization problem under the assumption that the tensor is
of low rank, the matrix nuclear norm minimization problem is being solved under
the assumption that a particular matricization of a tensor is of low rank.

Bounds for a version of the restricted isometry property for certain tensor formats
in [53] suggest that

$$m \geq C r^2 n$$

measurements should be sufficient when working directly with the tensor structure;
precisely, this bound uses the tensor train format [35]. (Possibly, the term $r^2$ may
even be lowered to $r$ when using the “right” tensor format.) However, connecting
the restricted isometry property in a completely satisfactory way with the success
of an efficient tensor recovery algorithm is still open. (Partial results are contained
in [53].) In any case, this suggests that one should exploit the tensor structure of
the problem rather than reducing to a matrix recovery problem in order to recover
a low rank tensor using the minimal number of measurements. Of course, similar
considerations apply to tensors of order higher than three, where the difference
between the reduction to the matrix case and working directly with the tensor
structure will become even stronger.

Unlike in the previously mentioned contributions, we consider the canonical tensor
rank and the corresponding tensor nuclear norm, which respects the tensor structure.
Even more, it is expected that the bound on the minimal number of measurements
needed for low rank tensor recovery via tensor nuclear norm minimization is optimal,
see also [62], where tensor completion via tensor nuclear norm minimization has
been considered. Unfortunately, it is in general NP-hard to solve this optimization
problem (since it is NP-hard to compute the tensor nuclear norm). To overcome
this difficulty, in this paper we provide the tensor $\theta_k$-norms, new tensor norms
which can be computed via semidefinite programming. These norms are closely
related to the tensor nuclear norm. That is, the unit $\theta_k$-norm balls (which are
defined for $k \in \mathbb{N}$) satisfy

$$\{\mathbf{X} : \|\mathbf{X}\|_{\theta_1} \leq 1\} \supseteq \cdots \supseteq \{\mathbf{X} : \|\mathbf{X}\|_{\theta_k} \leq 1\} \supseteq \{\mathbf{X} : \|\mathbf{X}\|_{\theta_{k+1}} \leq 1\} \supseteq \cdots \supseteq \{\mathbf{X} : \|\mathbf{X}\|_* \leq 1\}.$$

In particular, we show that in the matrix scenario all $\theta_k$-norms coincide with the
matrix nuclear norm. In the case of order-$d$ tensors ($d \geq 3$), we prove that the sequence
of the unit $\theta_k$-norm balls converges asymptotically to the unit tensor nuclear norm
ball. Next, we provide numerical experiments on low rank tensor recovery via
$\theta_1$-norm minimization, which indicate that this is a very promising approach. However,
we note that standard solvers for semidefinite programs only allow us to test our
method on small to moderate size problems. Nevertheless, it is likely that specialized
efficient algorithms can be developed. Indeed, recall that in the matrix case the
$\theta_k$-norms all coincide with the matrix nuclear norm, and state-of-the-art algorithms
allow us to compute the nuclear norm of matrices of large dimensions. This suggests
the possibility that new algorithms could be developed which would allow us to apply
our method to larger tensors. Thus, this paper presents the first step in a new convex
optimization approach to low rank tensor recovery.


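1.3. Some notation. We write vectors with small bold letters, matrices and tensors
with capital bold letters and sets with capital calligraphic letters. The cardinality
of a set $\mathcal{S}$ is denoted by $|\mathcal{S}|$.

For a matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ and subsets $\mathcal{I} \subset [m]$, $\mathcal{J} \subset [n]$, the submatrix of $\mathbf{A}$ with
rows indexed by $\mathcal{I}$ and columns indexed by $\mathcal{J}$ is denoted by $\mathbf{A}_{\mathcal{I},\mathcal{J}}$. The set of all
order-$k$ minors of $\mathbf{A}$ is of the form

$$\{\det(\mathbf{A}_{\mathcal{I},\mathcal{J}}) : \mathcal{I} \subset [m],\ \mathcal{J} \subset [n],\ |\mathcal{I}| = |\mathcal{J}| = k\}.$$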
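The Frobenius norm of a matrix $\mathbf{X} \in \mathbb{R}^{m \times n}$ is given as

$$\|\mathbf{X}\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij}^2} = \sqrt{\sum_{i=1}^{\min\{m,n\}} \sigma_i^2},$$

where the $\sigma_i$ list the singular values of $\mathbf{X}$. The nuclear norm is given by
$\|\mathbf{X}\|_* = \sum_{i=1}^{\min\{m,n\}} \sigma_i$. It is well-known that its unit ball is the convex hull of
all rank-one matrices of unit Frobenius norm.

The vectorization of a tensor $\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ is denoted by
$\operatorname{vec}(\mathbf{X}) \in \mathbb{R}^{n_1 n_2 \cdots n_d}$. The ordering of the elements in $\operatorname{vec}(\mathbf{X})$ is not important as
long as it remains consistent. Fibers are a higher order analogue of matrix rows and
columns. For $k \in [d]$, the mode-$k$ fiber of a $d$th-order tensor is obtained by fixing every
index except for the $k$-th one. The Frobenius norm of a $d$th-order tensor
$\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ is defined as

$$\|\mathbf{X}\|_F = \sqrt{\sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} \cdots \sum_{i_d=1}^{n_d} X_{i_1 i_2 \ldots i_d}^2}.$$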

Matricization (also called flattening) is the operation that transforms a tensor into
a matrix. More precisely, for a $d$th-order tensor $\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ and an ordered
subset $\mathcal{S} \subset [d]$, an $\mathcal{S}$-matricization
$\mathbf{X}^{\mathcal{S}} \in \mathbb{R}^{\prod_{k \in \mathcal{S}} n_k \times \prod_{\ell \in \mathcal{S}^c} n_\ell}$ is defined as

$$X^{\mathcal{S}}_{(i_k)_{k \in \mathcal{S}},\, (i_\ell)_{\ell \in \mathcal{S}^c}} = X_{i_1 i_2 \ldots i_d},$$

i.e., the indexes in the set $\mathcal{S}$ define the rows of a matrix and the indexes in the set
$\mathcal{S}^c = [d] \setminus \mathcal{S}$ define the columns. For a singleton set $\mathcal{S} = \{i\}$, for $i \in [d]$, we call the
$\mathcal{S}$-matricization the $i$-th unfolding. Notice that every $\mathcal{S}$-matricization of a rank-one
tensor is a rank-one matrix. Conversely, if every $\mathcal{S}$-matricization of a tensor is a
rank-one matrix, then the tensor is of rank one. This is even true if only all unfoldings
of a tensor are of rank one.
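
A minimal sketch of the $\mathcal{S}$-matricization follows, assuming 0-based mode indices and a row-major ordering of the combined indices (the text leaves the ordering free as long as it is consistent). The helper name `matricize` is an assumption of this example.

```python
import numpy as np

def matricize(X, S):
    """S-matricization: modes in S (0-based) index rows, the remaining modes columns."""
    S = list(S)
    rest = [k for k in range(X.ndim) if k not in S]
    rows = int(np.prod([X.shape[k] for k in S]))
    return np.transpose(X, S + rest).reshape(rows, -1)

rng = np.random.default_rng(2)
u, v, w = (rng.standard_normal(n) for n in (3, 4, 5))
X = np.einsum("i,j,k->ijk", u, v, w)          # rank-one tensor of order three
ranks = [np.linalg.matrix_rank(matricize(X, [k])) for k in range(3)]
print(ranks, matricize(X, [0, 1]).shape)
```

For this rank-one tensor all three unfoldings have numerical rank one, in line with the remark above.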

We often use MATLAB notation. Specifically, for a $d$th-order tensor
$\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$, we write $\mathbf{X}(:, :, \ldots, :, k)$ for the $(d-1)$th-order subtensor in
$\mathbb{R}^{n_1 \times \cdots \times n_{d-1}}$ obtained by fixing the last index $\alpha_d$ to $k$. For simplicity, the
subscripts $\alpha_1 \alpha_2 \cdots \alpha_d$ and $\beta_1 \beta_2 \cdots \beta_d$ will often be denoted by $\alpha$ and $\beta$,
respectively. In particular, instead of writing $x_{\alpha_1 \alpha_2 \ldots \alpha_d} x_{\beta_1 \beta_2 \ldots \beta_d}$, we often just
write $x_\alpha x_\beta$. Below, we will use the grevlex ordering of monomials indexed by
subscripts $\alpha$, which in particular requires defining an ordering for such subscripts.
We make the agreement that

$$x_{11\ldots11} > x_{11\ldots12} > \cdots > x_{11\ldots1n_d} > x_{11\ldots21} > \cdots > x_{n_1 n_2 \ldots n_d}.$$


1.4. Structure of the paper. In Section 2 we review the basic definition and
properties of theta bodies. Section 3 considers the matrix case. We introduce a
suitable polynomial ideal whose algebraic variety is the set of rank-one unit Frobenius
norm matrices. We discuss the corresponding $\theta_k$-norms and show that they all
coincide with the matrix nuclear norm. The case of $2 \times 2$ matrices is described in
detail. We then pass to the tensor case and first discuss the case of order-three
tensors. We introduce a suitable polynomial ideal, provide its reduced Gröbner
basis and define the corresponding $\theta_k$-norms. We additionally show that considering
matricizations corresponding to the TT-format will lead to the same polynomial
ideal and thus to the same $\theta_k$-norms. The general $d$th-order case is discussed at
the end of the same section. Here, we define the polynomial ideal $J_d$ which corresponds
to the set of all possible matricizations of the tensor. We show that a certain set of
order-two minors forms the reduced Gröbner basis for this ideal, which is key for
defining the $\theta_k$-norms. We additionally show that polynomial ideals corresponding
to different tensor formats (such as TT format or Tucker/HOSVD format) coincide
with the ideal $J_d$ and consequently lead to the same $\theta_k$-norms. Subsequently,
we discuss the convergence of the sequence of the unit $\theta_k$-norm balls to the unit
tensor nuclear norm ball, and briefly discuss the polynomial runtime of the
algorithms for computing and minimizing the $\theta_k$-norms, showing that our approach
is tractable. Numerical experiments for low rank recovery of third-order tensors,
which show that our approach successfully recovers a low rank tensor from incomplete
Gaussian random measurements, are presented thereafter. Appendix A discusses
some background from computer algebra (monomial orderings and Gröbner bases)
that is required throughout the main body of the article.

2. Theta bodies

As outlined above, we will introduce new tensor norms as relaxations of the
nuclear norm in order to come up with a new convex optimization approach for
low rank tensor recovery. Our approach builds on theta bodies, a recent concept
from computational algebraic geometry, which is similar to Lasserre relaxations [44].
In order to introduce it, we first discuss the necessary basics from computational
commutative algebra. For more information, we refer to the appendix.

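For a non-zero polynomial $f = \sum_{\alpha} a_\alpha \mathbf{x}^\alpha$ in $\mathbb{R}[\mathbf{x}] = \mathbb{R}[x_1, x_2, \ldots, x_n]$ and a
monomial order $>$, we denote

a) the multidegree of $f$ by $\operatorname{multideg}(f) = \max(\alpha \in \mathbb{Z}^n_{\geq 0} : a_\alpha \neq 0)$,

b) the leading coefficient of $f$ by $\operatorname{LC}(f) = a_{\operatorname{multideg}(f)} \in \mathbb{R}$,

c) the leading monomial of $f$ by $\operatorname{LM}(f) = \mathbf{x}^{\operatorname{multideg}(f)}$,

d) the leading term of $f$ by $\operatorname{LT}(f) = \operatorname{LC}(f) \operatorname{LM}(f)$.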
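
The quantities a) to d) can be computed mechanically once a monomial order is fixed. The sketch below assumes the grevlex order (which the paper adopts later) and represents a polynomial as a dictionary from exponent vectors to coefficients; this representation and the helper names are choices made for the example.

```python
def grevlex_key(alpha):
    """Sort key: a larger key means a larger monomial in the grevlex order."""
    return (sum(alpha), tuple(-a for a in reversed(alpha)))

def multideg(f):
    """Largest exponent vector with non-zero coefficient."""
    return max((a for a, c in f.items() if c != 0), key=grevlex_key)

def leading_term(f):
    """Return (LC(f), LM(f)), with LM(f) given by its exponent vector."""
    a = multideg(f)
    return f[a], a

# Variables ordered x11 > x12 > x21 > x22, exponents listed in that order.
# f = x12*x21 - x11*x22, a 2 x 2 minor up to sign.
f = {(0, 1, 1, 0): 1.0, (1, 0, 0, 1): -1.0}
# g = x11^2 + x12^2 + x21^2 + x22^2 - 1.
g = {(2, 0, 0, 0): 1.0, (0, 2, 0, 0): 1.0, (0, 0, 2, 0): 1.0,
     (0, 0, 0, 2): 1.0, (0, 0, 0, 0): -1.0}
print(leading_term(f), leading_term(g))
```

Under these assumptions the leading monomial of `f` is $x_{12}x_{21}$ and that of `g` is $x_{11}^2$.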

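Let $J \subset \mathbb{R}[\mathbf{x}]$ be a polynomial ideal. Its real algebraic variety is the set of all points
$\mathbf{x} \in \mathbb{R}^n$ where all polynomials in the ideal vanish, i.e.,

$$\nu_{\mathbb{R}}(J) = \{\mathbf{x} \in \mathbb{R}^n : f(\mathbf{x}) = 0 \text{ for all } f \in J\}.$$

By Hilbert's basis theorem, every polynomial ideal in $\mathbb{R}[\mathbf{x}]$ has a finite generating
set. Thus, we may assume that $J$ is generated by a set $\mathcal{F} = \{f_1, f_2, \ldots, f_k\}$ of
polynomials in $\mathbb{R}[\mathbf{x}]$ and write

$$J = \langle f_1, \ldots, f_k \rangle = \langle \{f_i\}_{i \in [k]} \rangle \quad \text{or simply} \quad J = \langle \mathcal{F} \rangle.$$

Its real algebraic variety is the set

$$\nu_{\mathbb{R}}(J) = \{\mathbf{x} \in \mathbb{R}^n : f_i(\mathbf{x}) = 0 \text{ for all } i \in [k]\}.$$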

Throughout the paper, $\mathbb{R}[\mathbf{x}]_k$ denotes the set of polynomials of degree at most $k$. A
degree one polynomial is also called a linear polynomial. A very useful certificate for
positivity of polynomials is contained in the following definition [26].

Definition 2.1. Let $J$ be an ideal in $\mathbb{R}[\mathbf{x}]$. A polynomial $f \in \mathbb{R}[\mathbf{x}]$ is $k$-sos mod $J$ if
there exists a finite set of polynomials $h_1, h_2, \ldots, h_t \in \mathbb{R}[\mathbf{x}]_k$ such that
$f \equiv \sum_{j=1}^{t} h_j^2$ mod $J$, i.e., if $f - \sum_{j=1}^{t} h_j^2 \in J$.

A special case of theta bodies was first introduced by Lovász in [47]; they later
appeared in full generality and have since been analyzed further. The
definitions and theorems in the remainder of the section are taken from [26].

Definition 2.2 (Theta body). Let $J \subset \mathbb{R}[\mathbf{x}]$ be an ideal. For a positive integer $k$,
the $k$-th theta body of $J$ is defined as

$$\mathrm{TH}_k(J) := \{\mathbf{x} \in \mathbb{R}^n : f(\mathbf{x}) \geq 0 \text{ for every linear } f \text{ that is } k\text{-sos mod } J\}.$$

We say that an ideal $J \subset \mathbb{R}[\mathbf{x}]$ is $\mathrm{TH}_k$-exact if $\mathrm{TH}_k(J)$ equals
$\overline{\operatorname{conv}(\nu_{\mathbb{R}}(J))}$, the closure of the convex hull of $\nu_{\mathbb{R}}(J)$.

Theta bodies are closed convex sets, while $\operatorname{conv}(\nu_{\mathbb{R}}(J))$ may not necessarily be
closed, and by definition,

$$\mathrm{TH}_1(J) \supseteq \mathrm{TH}_2(J) \supseteq \cdots \supseteq \operatorname{conv}(\nu_{\mathbb{R}}(J)). \tag{4}$$

The theta-body sequence of $J$ can converge (finitely or asymptotically), if at all,
only to $\overline{\operatorname{conv}(\nu_{\mathbb{R}}(J))}$. More on guarantees on convergence can be found in [26, 27].
However, to our knowledge, none of the existing guarantees apply to the cases
discussed below.

Given any polynomial, it is possible to check whether it is $k$-sos mod $J$ using
a Gröbner basis and semidefinite programming. However, using this definition in
practice requires knowledge of all linear polynomials (possibly infinitely many) that
are $k$-sos mod $J$. To overcome this difficulty, we need an alternative description of
$\mathrm{TH}_k(J)$, discussed next.

As in [2], we assume that there are no linear polynomials in the ideal $J$. Otherwise,
some variable $x_i$ would be congruent to a linear combination of other
variables modulo $J$ and we could work in a smaller polynomial ring
$\mathbb{R}[\mathbf{x}^i] = \mathbb{R}[x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n]$. Therefore, $\mathbb{R}[\mathbf{x}]_1 / J \cong \mathbb{R}[\mathbf{x}]_1$ and
$\{1 + J, x_1 + J, \ldots, x_n + J\}$ can be completed to a basis $\mathcal{B}$ of $\mathbb{R}[\mathbf{x}]/J$. Recall that the
degree of an equivalence class $f + J$, denoted by $\deg(f + J)$, is the smallest degree of an
element in the class. We assume that each element in the basis $\mathcal{B} = \{f_i + J\}$ of
$\mathbb{R}[\mathbf{x}]/J$ is represented by the polynomial whose degree equals the degree of its
equivalence class, i.e., $\deg(f_i + J) = \deg(f_i)$. In addition, we assume that $\mathcal{B} = \{f_i + J\}$
is ordered so that $f_{i+1} > f_i$, where $>$ is a fixed monomial ordering. Further, we
define the set $\mathcal{B}_k$ as

$$\mathcal{B}_k := \{f + J \in \mathcal{B} : \deg(f + J) \leq k\}.$$


Definition 2.3 (Theta basis). Let $J \subset \mathbb{R}[\mathbf{x}]$ be an ideal. A basis
$\mathcal{B} = \{f_0 + J, f_1 + J, \ldots\}$ of $\mathbb{R}[\mathbf{x}]/J$ is a $\theta$-basis if it has the following properties:

1) $\mathcal{B}_1 = \{1 + J, x_1 + J, \ldots, x_n + J\}$,

2) if $\deg(f_i + J), \deg(f_j + J) \leq k$, then $f_i f_j + J$ is in the $\mathbb{R}$-span of $\mathcal{B}_{2k}$.


As in [2, 26], we consider only monomial bases $\mathcal{B}$ of $\mathbb{R}[\mathbf{x}]/J$, i.e., bases $\mathcal{B}$ such
that $f_i$ is a monomial for all $f_i + J \in \mathcal{B}$.

For determining a $\theta$-basis, we first need to compute the reduced Gröbner basis $\mathcal{G}$
of the ideal $J$, see Definitions A.2 and A.3. The set $\mathcal{B}$ will satisfy the second property
in the definition of the theta basis if the reduced Gröbner basis is with respect to
an ordering which first compares the total degree. Therefore, throughout the paper
we use the graded reverse lexicographic ordering (Definition A.1), or simply grevlex
ordering, although the graded lexicographic ordering would also be appropriate.

A technique to compute a $\theta$-basis $\mathcal{B}$ of $\mathbb{R}[\mathbf{x}]/J$ consists in taking $\mathcal{B}$ to be the set
of equivalence classes of the standard monomials of the corresponding initial ideal

$$J_{\mathrm{initial}} = \langle \{\operatorname{LT}(f)\}_{f \in J} \rangle = \langle \{\operatorname{LT}(g_i)\}_{i \in [s]} \rangle,$$

where $\mathcal{G} = \{g_1, g_2, \ldots, g_s\}$ is the reduced Gröbner basis of the ideal $J$. In other
words, a set $\mathcal{B} = \{f_0 + J, f_1 + J, \ldots\}$ will be a $\theta$-basis of $\mathbb{R}[\mathbf{x}]/J$ if it contains all
$f_i + J$ such that

1) $f_i$ is a monomial,

2) $f_i$ is not divisible by any of the monomials in the set $\{\operatorname{LT}(g_i) : i \in [s]\}$.
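
The two conditions above amount to a simple enumeration. As an illustration, consider the ideal generated by $x_{11}^2 + x_{12}^2 + x_{21}^2 + x_{22}^2 - 1$ and the minor $x_{12}x_{21} - x_{11}x_{22}$ (the $2 \times 2$ instance of the construction described in the introduction). The sketch assumes that these two generators form the reduced Gröbner basis with leading monomials $x_{11}^2$ and $x_{12}x_{21}$; the helper name `standard_monomials` is hypothetical.

```python
from itertools import product

def standard_monomials(n_vars, leading, max_deg):
    """Exponent vectors of degree <= max_deg not divisible by any leading monomial."""
    def divisible(a, b):                      # does x^b divide x^a ?
        return all(ai >= bi for ai, bi in zip(a, b))
    return [a for a in product(range(max_deg + 1), repeat=n_vars)
            if sum(a) <= max_deg and not any(divisible(a, b) for b in leading)]

# Variables (x11, x12, x21, x22); assumed leading monomials x11^2 and x12*x21.
leading = [(2, 0, 0, 0), (0, 1, 1, 0)]
B1 = standard_monomials(4, leading, 1)
B2 = standard_monomials(4, leading, 2)
print(len(B1), len(B2))
```

Here `B1` contains the constant and the four variables, and `B2` drops exactly the two leading monomials from the fifteen monomials of degree at most two.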

The next important tool we need is the combinatorial moment matrix of $J$. To
this end, we fix a $\theta$-basis $\mathcal{B} = \{f_i + J\}$ of $\mathbb{R}[\mathbf{x}]/J$ and define $[\mathbf{x}]_{\mathcal{B}_k}$ to be the column
vector formed by all elements of $\mathcal{B}_k$ in order. Then $[\mathbf{x}]_{\mathcal{B}_k} [\mathbf{x}]_{\mathcal{B}_k}^T$ is a square matrix
indexed by $\mathcal{B}_k$ and its $(i,j)$-entry is equal to $f_i f_j + J$. By hypothesis, the entries of
$[\mathbf{x}]_{\mathcal{B}_k} [\mathbf{x}]_{\mathcal{B}_k}^T$ lie in the $\mathbb{R}$-span of $\mathcal{B}_{2k}$. Let $\{\lambda^l_{i,j}\}$ be the unique set of real numbers
such that $f_i f_j + J = \sum_{f_l + J \in \mathcal{B}_{2k}} \lambda^l_{i,j} (f_l + J)$.

The theta bodies can be characterized via the combinatorial moment matrix
as stated in the next result from [26], which will be the basis for computing and
minimizing the new tensor norm introduced below via semidefinite programming.


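Definition 2.4. Let $J$, $\mathcal{B}$ and $\{\lambda^l_{i,j}\}$ be as above. Let $\mathbf{y}$ be a real vector indexed by
$\mathcal{B}_{2k}$ with $y_0 = 1$, where $y_0$ is the first entry of $\mathbf{y}$, indexed by the basis element $1 + J$.
The $k$-th combinatorial moment matrix $\mathbf{M}_{\mathcal{B}_k}(\mathbf{y})$ of $J$ is the real matrix indexed by
$\mathcal{B}_k$ whose $(i,j)$-entry is $[\mathbf{M}_{\mathcal{B}_k}(\mathbf{y})]_{i,j} = \sum_{f_l + J \in \mathcal{B}_{2k}} \lambda^l_{i,j} y_l$.

Theorem 2.5. The $k$-th theta body of $J$, $\mathrm{TH}_k(J)$, is the closure of

$$Q_{\mathcal{B}_k}(J) = \pi_{\mathbb{R}^n}\{\mathbf{y} \in \mathbb{R}^{\mathcal{B}_{2k}} : \mathbf{M}_{\mathcal{B}_k}(\mathbf{y}) \succeq 0,\ y_0 = 1\},$$

where $\pi_{\mathbb{R}^n}$ denotes the projection onto the variables $y_1 = y_{x_1 + J}, \ldots, y_n = y_{x_n + J}$.
Algorithm 1 shows a step-by-step procedure for computing $\mathrm{TH}_k(J)$.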


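Algorithm 1 Algorithm for computing $\mathrm{TH}_k(J)$

Input: An ideal $J \subseteq \mathbb{R}[\mathbf{x}] = \mathbb{R}[x_1, x_2, \ldots, x_n]$

Compute the reduced Gröbner basis for the ideal $J$

Compute a $\theta$-basis $\mathcal{B} = \mathcal{B}_1 \cup \mathcal{B}_2 \cup \ldots = \{f_0 + J, f_1 + J, \ldots\}$ of $\mathbb{R}[\mathbf{x}]/J$ (see
Definition 2.3)

Compute the combinatorial moment matrix $\mathbf{M}_{\mathcal{B}_k}(\mathbf{y})$:

(1) $[\mathbf{x}]_{\mathcal{B}_k}$ = {all elements of $\mathcal{B}_k$ in order}

(2) $(\mathbf{X}_{\mathcal{B}_k})_{i,j} = ([\mathbf{x}]_{\mathcal{B}_k} [\mathbf{x}]_{\mathcal{B}_k}^T)_{i,j} = f_i f_j + J = \sum_{f_l + J \in \mathcal{B}_{2k}} \lambda^l_{i,j} (f_l + J)$

(3) $[\mathbf{M}_{\mathcal{B}_k}(\mathbf{y})]_{i,j} = \sum_{f_l + J \in \mathcal{B}_{2k}} \lambda^l_{i,j} y_l$

Output: $\mathrm{TH}_k(J)$ is the closure of

$$Q_{\mathcal{B}_k}(J) = \pi_{\mathbb{R}^n}\{\mathbf{y} \in \mathbb{R}^{\mathcal{B}_{2k}} : \mathbf{M}_{\mathcal{B}_k}(\mathbf{y}) \succeq 0,\ y_0 = 1\}.$$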
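
To see the steps of the algorithm on the smallest possible case, consider a toy ideal that is not taken from the paper: $n = 1$ and $J = \langle x^2 - 1 \rangle$, whose variety is $\{-1, 1\}$. A $\theta$-basis is $\{1 + J, x + J\}$, and since $x \cdot x \equiv 1$ mod $J$, the first moment matrix has $y_0$ in both diagonal entries. The sketch below tests positive semidefiniteness on a grid; the function names are assumptions of this example.

```python
import numpy as np

def moment_matrix(y1, y0=1.0):
    """M_{B_1}(y) for J = <x^2 - 1>; the (x, x) entry reduces to y0 since x^2 = 1 mod J."""
    return np.array([[y0, y1], [y1, y0]])

def in_Q1(y1, tol=1e-12):
    """Membership test: is the moment matrix positive semidefinite (up to tol)?"""
    return np.linalg.eigvalsh(moment_matrix(y1)).min() >= -tol

grid = np.linspace(-2.0, 2.0, 41)
feasible = [t for t in grid if in_Q1(t)]
print(min(feasible), max(feasible))
```

The feasible grid points fill the interval from about $-1$ to about $1$, which is the convex hull of the variety, so in this toy case the first theta body is already exact.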


3. The matrix case

As a start, we consider the matrix nuclear unit norm ball and provide hierarchical
relaxations via theta bodies. The $k$-th relaxation defines a matrix unit $\theta_k$-norm ball
with the property

$$\|\mathbf{X}\|_{\theta_k} \leq \|\mathbf{X}\|_{\theta_{k+1}} \quad \text{for all } \mathbf{X} \in \mathbb{R}^{m \times n} \text{ and all } k \in \mathbb{N}.$$

However, we will show that all these $\theta_k$-norms coincide with the matrix nuclear
norm.

The first step in computing hierarchical relaxations of the unit nuclear norm
ball consists in finding a polynomial ideal $J$ such that its algebraic variety (the set
of points for which the ideal vanishes) coincides with the set of all rank-one, unit
Frobenius norm matrices

$$\nu_{\mathbb{R}}(J) = \{\mathbf{X} \in \mathbb{R}^{m \times n} : \|\mathbf{X}\|_F = 1,\ \operatorname{rank}(\mathbf{X}) = 1\}. \tag{5}$$

Recall that the convex hull of this set is the nuclear norm ball. The following lemma
states the elementary fact that a non-zero matrix is a rank-one matrix if and only if
all its minors of order two are zero.

For notational purposes, we define the following polynomials in
$\mathbb{R}[\mathbf{x}] = \mathbb{R}[x_{11}, x_{12}, \ldots, x_{mn}]$:

$$g(\mathbf{x}) = \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}^2 - 1 \quad \text{and} \quad f_{ijkl}(\mathbf{x}) = x_{il} x_{kj} - x_{ij} x_{kl}
\quad \text{for } 1 \leq i < k \leq m,\ 1 \leq j < l \leq n. \tag{6}$$

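Lemma 3.1. Let $\mathbf{X} \in \mathbb{R}^{m \times n} \setminus \{\mathbf{0}\}$. Then $\mathbf{X}$ is a rank-one, unit Frobenius norm
matrix if and only if

$$\mathbf{X} \in \mathcal{R} := \{\mathbf{X} : g(\mathbf{X}) = 0 \text{ and } f_{ijkl}(\mathbf{X}) = 0 \text{ for all } i < k,\ j < l\}. \tag{7}$$

Proof. If $\mathbf{X} \in \mathbb{R}^{m \times n}$ is a rank-one matrix with $\|\mathbf{X}\|_F = 1$, then by definition there
exist two vectors $\mathbf{u} \in \mathbb{R}^m$ and $\mathbf{v} \in \mathbb{R}^n$ such that $X_{ij} = u_i v_j$ for all $i \in [m]$, $j \in [n]$
and $\|\mathbf{u}\|_2 = \|\mathbf{v}\|_2 = 1$. Thus

$$-f_{ijkl}(\mathbf{X}) = X_{ij} X_{kl} - X_{il} X_{kj} = u_i v_j u_k v_l - u_i v_l u_k v_j = 0$$

and

$$\sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij}^2 = \sum_{i=1}^{m} u_i^2 \sum_{j=1}^{n} v_j^2 = 1.$$

For the converse, let $\mathbf{X}_{\cdot j}$ denote the $j$-th column of a matrix $\mathbf{X} \in \mathcal{R}$. Then, for
all $j, l \in [n]$ with $j < l$, it holds

$$X_{ml} \cdot \mathbf{X}_{\cdot j} - X_{mj} \cdot \mathbf{X}_{\cdot l} =
\begin{pmatrix} X_{1j} X_{ml} - X_{1l} X_{mj} \\ \vdots \\ X_{mj} X_{ml} - X_{ml} X_{mj} \end{pmatrix} = \mathbf{0},$$

since $X_{ij} X_{ml} = X_{il} X_{mj}$ for all $i \in [m-1]$ by definition of $\mathcal{R}$. Thus, the columns
of the matrix $\mathbf{X}$ span a space of dimension one, i.e., the matrix $\mathbf{X}$ is a rank-one
matrix. From $\sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij}^2 - 1 = 0$ it follows that the matrix $\mathbf{X}$ is normalized,
i.e., $\|\mathbf{X}\|_F = 1$. □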
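
The statement of the lemma is easy to test numerically. The following sketch evaluates $g$ and all order-two minors $f_{ijkl}$ on a normalized rank-one matrix and on a generic perturbation of it; the helper names `g` and `minors2` and the test data are assumptions of this example, and indices are 0-based.

```python
import numpy as np
from itertools import combinations

def g(X):
    """Squared Frobenius norm minus one."""
    return np.sum(X**2) - 1.0

def minors2(X):
    """All f_ijkl(X) = X[i,l]*X[k,j] - X[i,j]*X[k,l] for i < k and j < l."""
    m, n = X.shape
    return np.array([X[i, l] * X[k, j] - X[i, j] * X[k, l]
                     for i, k in combinations(range(m), 2)
                     for j, l in combinations(range(n), 2)])

rng = np.random.default_rng(3)
u, v = rng.standard_normal(4), rng.standard_normal(3)
X = np.outer(u / np.linalg.norm(u), v / np.linalg.norm(v))   # rank one, unit norm
Y = X + 0.1 * rng.standard_normal(X.shape)                   # generic perturbation
print(abs(g(X)), np.abs(minors2(X)).max(), np.abs(minors2(Y)).max())
```

For `X` both quantities vanish up to rounding, while the perturbed matrix `Y` has non-zero minors.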


It follows from Lemma 3.1 that the set of rank-one, unit Frobenius norm matrices
coincides with the algebraic variety $\nu_{\mathbb{R}}(J_{M_{mn}})$ for the ideal $J_{M_{mn}}$ generated by the
polynomials $g$ and $f_{ijkl}$, i.e.,

$$J_{M_{mn}} = \langle \mathcal{G}_{M_{mn}} \rangle \quad \text{with} \quad
\mathcal{G}_{M_{mn}} = \{g(\mathbf{x})\} \cup \{f_{ijkl}(\mathbf{x}) : 1 \leq i < k \leq m,\ 1 \leq j < l \leq n\}. \tag{8}$$

Recall that the convex hull of the set $\mathcal{R}$ in (7) forms the unit nuclear norm ball and
by definition of the theta bodies,

$$\operatorname{conv}(\nu_{\mathbb{R}}(J_{M_{mn}})) \subseteq \cdots \subseteq \mathrm{TH}_{k+1}(J_{M_{mn}}) \subseteq \mathrm{TH}_k(J_{M_{mn}}) \subseteq \cdots \subseteq \mathrm{TH}_1(J_{M_{mn}}).$$

Therefore, the theta bodies form closed, convex hierarchical relaxations of the
matrix nuclear norm ball. In addition, the theta body $\mathrm{TH}_k(J_{M_{mn}})$ is symmetric,
$\mathrm{TH}_k(J_{M_{mn}}) = -\mathrm{TH}_k(J_{M_{mn}})$. Therefore, it defines a unit ball of a norm that we
call the $\theta_k$-norm.

The next result shows that the generating set of the ideal $J_{M_{mn}}$ introduced above
is a Gröbner basis.


Lemma 3.2. The set $\mathcal{G}_{M_{mn}}$ forms the reduced Gröbner basis of the ideal $J_{M_{mn}}$
with respect to the grevlex order.


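Proof. The set $\mathcal{G}_{M_{mn}}$ is clearly a basis for the ideal $J_{M_{mn}}$. By Proposition A.8
in the appendix, we only need to check whether the $S$-polynomial, see Definition
A.6, satisfies $S(p, q) \to_{\mathcal{G}_{M_{mn}}} 0$ for all $p, q \in \mathcal{G}_{M_{mn}}$ whenever the leading monomials
$\operatorname{LM}(p)$ and $\operatorname{LM}(q)$ are not relatively prime. Here, $S(p, q) \to_{\mathcal{G}_{M_{mn}}} 0$ means that
$S(p, q)$ reduces to 0 modulo $\mathcal{G}_{M_{mn}}$, see Definition A.5.

Notice that $\operatorname{LM}(g) = x_{11}^2$ and $\operatorname{LM}(f_{ijkl}) = x_{il} x_{kj}$ are relatively prime for all
$1 \leq i < k \leq m$ and $1 \leq j < l \leq n$. Therefore, we only need to show that
$S(f_{ijkl}, f_{\hat{i}\hat{j}\hat{k}\hat{l}}) \to_{\mathcal{G}_{M_{mn}}} 0$ whenever the leading monomials $\operatorname{LM}(f_{ijkl})$ and
$\operatorname{LM}(f_{\hat{i}\hat{j}\hat{k}\hat{l}})$ are not relatively prime. First we consider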


/yfe/(x) = XiiXkj - XijXki and fffkii^) = ~ 
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for 1 < i < k < k < m, 1 < j < j < I < n. The 5'-polynomial is then of the form 

SO that S{fijki, f^jki) remaining cases are treated with similar 

arguments. 

In order to show that Givimn i® ^ reduced Grobner basis (see Definition A.3), we 
first notice that LC(/) = 1 for all / G In addition, the leading monomial 

of f G Ghimn is always of degree two and there are no two different polynomials 
fi, fj G Gm^„ such that LM(/i) = LM{fj). Therefore, is the reduced Grobner 

basis of the ideal Jm„„ with respect to the grevlex order. □ 


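The Gröbner basis $\mathcal{G}_{M_{mn}}$ of $J_{M_{mn}} = \langle \mathcal{G}_{M_{mn}} \rangle$ yields the $\theta$-basis of $\mathbb{R}[x]/J_{M_{mn}}$. For the sake of simplicity, we only provide its elements up to degree two,

\[ B_1 = \{1 + J_{M_{mn}},\ x_{11} + J_{M_{mn}},\ x_{12} + J_{M_{mn}},\ \ldots,\ x_{mn} + J_{M_{mn}}\}, \]

\[ B_2 = B_1 \cup \{x_{ij}x_{kl} + J_{M_{mn}} : (i,j,k,l) \in S_{B_2}\}, \]

where $S_{B_2} = \{(i,j,k,l) : 1 \le i \le k \le m,\ 1 \le j \le l \le n\} \setminus \{(1,1,1,1)\}$. Given the $\theta$-basis, the theta body $\mathrm{TH}_k(J_{M_{mn}})$ is well-defined. We formally introduce an associated norm next.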


Definition 3.3. The matrix $\theta_k$-norm, denoted by $\|\cdot\|_{\theta_k}$, is the norm induced by the $k$-theta body $\mathrm{TH}_k(J_{M_{mn}})$, i.e.,

\[ \|X\|_{\theta_k} = \inf\{r : X \in r\,\mathrm{TH}_k(J_{M_{mn}})\}. \]

The $\theta_k$-norm can be computed with the help of Theorem 2.5, i.e., as

\[ \|X\|_{\theta_k} = \min t \quad\text{subject to}\quad X \in t\,Q_{B_k}(J_{M_{mn}}). \]

Given the moment matrix $M_{B_k}[y]$ associated with $J_{M_{mn}}$, this minimization program is equivalent to the semidefinite program

\[ \min_{t \in \mathbb{R},\, y \in \mathbb{R}^{B_k}} t \quad\text{subject to}\quad M_{B_k}[y] \succeq 0,\quad y_0 = t,\quad y_{B_1} = X. \tag{9} \]

The last constraint might require some explanation. The vector $y_{B_1}$ denotes the restriction of $y$ to the indices in $B_1$, where the latter can be identified with the set $[m] \times [n]$ indexing the matrix entries. Therefore, $y_{B_1} = X$ means componentwise $y_{x_{11}+J} = X_{11}$, $y_{x_{12}+J} = X_{12}$, ..., $y_{x_{mn}+J} = X_{mn}$. For the purpose of illustration, we focus on the $\theta_1$-norm in $\mathbb{R}^{2 \times 2}$ in Section 3.1 below, and provide a step-by-step procedure for building the corresponding semidefinite program in (21).

Notice that the number of elements in $B_1$ is $mn + 1$, and in $B_2 \setminus B_1$ is $\frac{m(m+1)}{2} \cdot \frac{n(n+1)}{2} - 1$, i.e., the number of elements of the $\theta$-basis restricted to degree 2 scales polynomially in the total number of matrix entries $mn$. Therefore, the computational complexity of the SDP in (21) is polynomial in $mn$.
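The size of the degree-two part of the $\theta$-basis can be checked by brute force. The sketch below is an illustration under the assumption that the standard monomials are exactly the degree-two monomials divisible by no leading monomial of $\mathcal{G}_{M_{mn}}$, i.e., neither by $x_{11}^2$ nor by $x_{il}x_{kj}$ with $i < k$, $j < l$. The function name `theta_basis_degree_two` is our own.

```python
from itertools import combinations_with_replacement


def theta_basis_degree_two(m, n):
    """Degree-two standard monomials x_ij x_kl, as pairs of index pairs.

    A monomial is kept if it is divisible by no leading monomial of the
    Groebner basis: neither x_11^2 nor x_il x_kj with i < k and j < l.
    """
    variables = [(i, j) for i in range(1, m + 1) for j in range(1, n + 1)]
    basis = []
    for a, b in combinations_with_replacement(variables, 2):
        (i, j), (k, l) = a, b
        if a == b == (1, 1):
            continue  # x_11^2 is the leading monomial of g
        if (i - k) * (j - l) < 0:
            # rows and columns are ordered in opposite directions, so the
            # product is the leading monomial x_il x_kj of a 2 x 2 minor
            continue
        basis.append((a, b))
    return basis


for m, n in [(2, 2), (2, 3), (3, 3)]:
    count = len(theta_basis_degree_two(m, n))
    print(m, n, count, m * (m + 1) * n * (n + 1) // 4 - 1)
```

For each pair $(m,n)$ the two printed counts should agree, which matches the polynomial growth of $|B_2 \setminus B_1|$ in $mn$.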

We will show next that the theta body $\mathrm{TH}_1(J)$ and hence all $\mathrm{TH}_k(J)$ for $k \in \mathbb{N}$ coincide with the nuclear norm ball. To this end, the following lemma provides expressions for the boundary of the matrix nuclear unit norm ball.


Lemma 3.4. Let $O_c$ ($O_r$) denote the set of all matrices $M \in \mathbb{R}^{n \times m}$ with orthonormal columns (rows), i.e., $O_c = \{M \in \mathbb{R}^{n \times m} : M^T M = I_m\}$ and $O_r = \{M \in \mathbb{R}^{n \times m} : M M^T = I_n\}$. Then

\[ \{X \in \mathbb{R}^{m \times n} : \|X\|_* \le 1\} = \{X \in \mathbb{R}^{m \times n} : \operatorname{tr}(MX) \le 1 \text{ for all } M \in O_c \cup O_r\}. \tag{10} \]


Remark 1. Notice that $O_c = \emptyset$ for $m > n$ and $O_r = \emptyset$ for $m < n$.

Proof. It suffices to treat the case $m \le n$ because $\|X\|_* = \|X^T\|_*$ for all matrices $X$, and $M \in O_r$ if and only if $M^T \in O_c$. Let $X \in \mathbb{R}^{m \times n}$ be such that $\|X\|_* \le 1$ and let $X = U \Sigma V^T$ be its singular value decomposition. For $M \in O_c$, the spectral norm satisfies $\|M\| \le 1$ and therefore, using that the nuclear norm is the dual of the spectral norm, see e.g. [n p. 96],

\[ \operatorname{tr}(MX) \le \|M\| \cdot \|X\|_* \le \|X\|_* \le 1. \]

For the converse, let $X \in \mathbb{R}^{m \times n}$ be such that $\operatorname{tr}(MX) \le 1$ for all $M \in O_c$. Let $X = U \Sigma V^T$ denote its reduced singular value decomposition, i.e., $U, \Sigma \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times m}$ with $U^T U = U U^T = V^T V = I_m$. Since $M := V U^T \in O_c$, it follows that

\[ 1 \ge \operatorname{tr}(MX) = \operatorname{tr}(V U^T U \Sigma V^T) = \operatorname{tr}(\Sigma) = \|X\|_*. \]

This completes the proof. □


Next, using Lemma 3.4 we show that the theta body THi(J) equals the nuclear 
norm ball. This result is related to Theorem 4.4 in m- 

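Theorem 3.5. The polynomial ideal $J_{M_{mn}}$ defined in (8) is $\mathrm{TH}_1$-exact, i.e.,

\[ \mathrm{TH}_1(J_{M_{mn}}) = \operatorname{conv}\big(x : g(x) = 0,\ f_{ijkl}(x) = 0 \text{ for all } i < k,\ j < l\big). \]

In other words,

\[ \{X \in \mathbb{R}^{m \times n} : X \in \mathrm{TH}_1(J_{M_{mn}})\} = \{X \in \mathbb{R}^{m \times n} : \|X\|_* \le 1\}. \]

Proof. By definition of $\mathrm{TH}_1(J_{M_{mn}})$, it is enough to show that the boundary of the unit nuclear norm ball can be written as 1-sos mod $J_{M_{mn}}$, which by Lemma 3.4 means that the polynomial $1 - \sum_{i=1}^{n} \sum_{j=1}^{m} M_{ij} x_{ji}$ is 1-sos mod $J_{M_{mn}}$ for all $M \in O_c \cup O_r$.

We start by fixing $M = \begin{pmatrix} I_m \\ 0 \end{pmatrix}$ in case $m \le n$ and $M = \begin{pmatrix} I_n & 0 \end{pmatrix}$ in case $m > n$, where $I_k \in \mathbb{R}^{k \times k}$ is the identity matrix. For this choice of $M$, we need to show that $1 - \sum_{i=1}^{\ell} x_{ii}$ is 1-sos mod $J_{M_{mn}}$, where $\ell = \min\{m, n\}$. Note that

\[ \begin{aligned} 1 - \sum_{i=1}^{\ell} x_{ii} = \frac{1}{2} \Big[ & \Big(1 - \sum_{i=1}^{\ell} x_{ii}\Big)^2 + \sum_{i<j\le\ell} (x_{ij} - x_{ji})^2 + \sum_{i=1}^{\ell} \sum_{j=m+1}^{n} x_{ij}^2 + \sum_{i=n+1}^{m} \sum_{j=1}^{\ell} x_{ij}^2 \\ & + \Big(1 - \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}^2\Big) - 2 \sum_{i<j\le\ell} (x_{ii}x_{jj} - x_{ij}x_{ji}) \Big], \end{aligned} \]

since

\[ \Big(1 - \sum_{i=1}^{\ell} x_{ii}\Big)^2 = 1 - 2\sum_{i=1}^{\ell} x_{ii} + \sum_{i=1}^{\ell} \sum_{j=1}^{\ell} x_{ii}x_{jj} = 1 - 2\sum_{i=1}^{\ell} x_{ii} + 2\sum_{i<j\le\ell} x_{ii}x_{jj} + \sum_{i=1}^{\ell} x_{ii}^2, \]

\[ 1 - \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}^2 + \sum_{i=1}^{\ell} \sum_{j=m+1}^{n} x_{ij}^2 + \sum_{i=n+1}^{m} \sum_{j=1}^{\ell} x_{ij}^2 = 1 - \sum_{i=1}^{\ell} \sum_{j=1}^{\ell} x_{ij}^2 = 1 - \sum_{i<j\le\ell} (x_{ij}^2 + x_{ji}^2) - \sum_{i=1}^{\ell} x_{ii}^2, \]

and

\[ \sum_{i<j\le\ell} (x_{ij} - x_{ji})^2 - 2 \sum_{i<j\le\ell} (x_{ii}x_{jj} - x_{ij}x_{ji}) = \sum_{i<j\le\ell} (x_{ij}^2 + x_{ji}^2) - 2 \sum_{i<j\le\ell} x_{ii}x_{jj}. \]

Therefore, $1 - \sum_{i=1}^{\ell} x_{ii}$ is 1-sos mod $J_{M_{mn}}$, since the polynomials $1 - \sum_{i=1}^{\ell} x_{ii}$, $x_{ij} - x_{ji}$, $x_{ij}$, and $x_{ji}$ are linear and the polynomials $1 - \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}^2$ and $2(x_{ii}x_{jj} - x_{ij}x_{ji})$ are contained in the ideal, for all $i < j \le \ell$.

Next, we define transformed variables

\[ x'_{ij} = \begin{cases} \sum_{k=1}^{m} M_{ik} x_{kj} & \text{if } m \le n, \\ \sum_{k=1}^{n} x_{ik} M_{kj} & \text{if } m > n. \end{cases} \]

Since $x'_{ij}$ is a linear combination of $\{x_{kj}\}_{k=1}^{m} \cup \{x_{ik}\}_{k=1}^{n}$ for every $i \in [m]$ and $j \in [n]$, linearity of the polynomials $1 - \sum_{i=1}^{\ell} x'_{ii}$, $x'_{ij} - x'_{ji}$, $x'_{ij}$, and $x'_{ji}$ is preserved, for all $i < j$. It remains to show that the ideal is invariant under this transformation. For the polynomial $1 - \sum_{i=1}^{m} \sum_{j=1}^{n} (x'_{ij})^2$ this is clear since $M \in \mathbb{R}^{n \times m}$ has unitary columns in case $m \le n$ and unitary rows in case $m > n$. In the case $m \le n$, the polynomial $x'_{ii}x'_{jj} - x'_{ij}x'_{ji}$ is contained in the ideal $J$ since

\[ x'_{ii}x'_{jj} - x'_{ij}x'_{ji} = \sum_{k=1}^{m} \sum_{l=1}^{m} M_{ik} M_{jl} (x_{ki}x_{lj} - x_{kj}x_{li}) \]

and the polynomials $x_{ki}x_{lj} - x_{kj}x_{li}$ are contained in $J$ for all $i < j \le m$. Similarly, in case $m > n$ the polynomial $x'_{ii}x'_{jj} - x'_{ij}x'_{ji}$ is in the ideal since

\[ x'_{ii}x'_{jj} - x'_{ij}x'_{ji} = \sum_{k=1}^{n} \sum_{l=1}^{n} M_{ki} M_{lj} (x_{ik}x_{jl} - x_{il}x_{jk}) \]

and the polynomials $x_{ik}x_{jl} - x_{il}x_{jk}$ are in the ideal, for all $i < j \le n$. □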
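The sum-of-squares decomposition of $1 - \sum_{i=1}^{\ell} x_{ii}$ used in the proof of Theorem 3.5 is a polynomial identity, so it has to hold at every real matrix and not only on the variety. The Python sketch below (an illustration; the function name `sos_identity_gap` is ours, and indices are 0-based) evaluates both sides at random points for several shapes $m \times n$.

```python
import random


def sos_identity_gap(X):
    """Left-hand side minus right-hand side of the 1-sos identity at X."""
    m, n = len(X), len(X[0])
    ell = min(m, n)
    lhs = 1.0 - sum(X[i][i] for i in range(ell))
    pairs = [(i, j) for i in range(ell) for j in range(i + 1, ell)]
    # squares of linear polynomials
    squares = (lhs ** 2
               + sum((X[i][j] - X[j][i]) ** 2 for i, j in pairs)
               + sum(X[i][j] ** 2 for i in range(ell) for j in range(m, n))
               + sum(X[i][j] ** 2 for i in range(n, m) for j in range(ell)))
    # combination of generators of the ideal (it vanishes on the variety)
    ideal_part = (1.0 - sum(x * x for row in X for x in row)
                  - 2.0 * sum(X[i][i] * X[j][j] - X[i][j] * X[j][i]
                              for i, j in pairs))
    return lhs - 0.5 * (squares + ideal_part)


random.seed(0)
for m, n in [(2, 2), (2, 4), (4, 2), (3, 3)]:
    X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
    print(m, n, abs(sos_identity_gap(X)))  # zero up to rounding
```

On the variety the `ideal_part` term vanishes, so there $1 - \sum_i x_{ii}$ equals one half of a sum of squares of linear polynomials.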


The following corollary is a direct consequence of Theorem 3.5 and the nestedness 
property Q of theta bodies. 

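Corollary 3.6. The matrix $\theta_1$-norm coincides with the matrix nuclear norm, i.e.,

\[ \|X\|_* = \|X\|_{\theta_1} \quad\text{for all } X \in \mathbb{R}^{m \times n}. \]

Moreover,

\[ \mathrm{TH}_1(J_{M_{mn}}) = \mathrm{TH}_2(J_{M_{mn}}) = \cdots = \operatorname{conv}\big(\nu_{\mathbb{R}}(J_{M_{mn}})\big). \]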

Remark 2. The ideal ([^ is not the only choice that satisfies ([^. For example, in 
m the following polynomial ideal was suggested 

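\[ J = \Big\langle x_{ij} - u_i v_j \ \ \forall i \in [m],\, j \in [n], \quad \sum_{i=1}^{m} u_i^2 - 1, \quad \sum_{j=1}^{n} v_j^2 - 1 \Big\rangle \tag{11} \]

in $\mathbb{R}[x, u, v] = \mathbb{R}[x_{11}, \ldots, x_{mn}, u_1, \ldots, u_m, v_1, \ldots, v_n]$. Some tedious computations reveal the reduced Gröbner basis $\mathcal{G}$ of the ideal $J$ with respect to the grevlex (and grlex) ordering,

\[ \begin{aligned} \mathcal{G} = {} & \big\{ g_1^{i,j} = x_{ij} - u_i v_j : i \in [m],\ j \in [n] \big\} \cup \Big\{ g_2 = \sum_{i=1}^{m} u_i^2 - 1,\ g_3 = \sum_{j=1}^{n} v_j^2 - 1 \Big\} \\ & \cup \big\{ g_4^{i,j,k} = x_{ij}u_k - x_{kj}u_i : 1 \le i < k \le m,\ j \in [n] \big\} \\ & \cup \big\{ g_5^{i,j,k} = x_{ij}v_k - x_{ik}v_j : i \in [m],\ 1 \le j < k \le n \big\} \\ & \cup \Big\{ g_6^{i} = \sum_{j=1}^{n} x_{ij}v_j - u_i : i \in [m] \Big\} \cup \Big\{ g_7^{j} = \sum_{i=1}^{m} x_{ij}u_i - v_j : j \in [n] \Big\} \\ & \cup \Big\{ g_8^{i,j} = \sum_{k=1}^{n} x_{ik}x_{jk} - u_i u_j : 1 \le i < j \le m \Big\} \\ & \cup \Big\{ g_9^{i,j} = \sum_{k=1}^{m} x_{ki}x_{kj} - v_i v_j : 1 \le i < j \le n \Big\} \\ & \cup \Big\{ g_{10}^{i} = \sum_{j=1}^{n} x_{ij}^2 - u_i^2 : 2 \le i \le m \Big\} \cup \Big\{ g_{11}^{j} = \sum_{i=1}^{m} x_{ij}^2 - v_j^2 : 2 \le j \le n \Big\} \\ & \cup \big\{ g_{12}^{i,j,k,l} = x_{ij}x_{kl} - x_{il}x_{kj} : 1 \le i < k \le m,\ 1 \le j < l \le n \big\} \\ & \cup \Big\{ g_{13} = x_{11}^2 - \sum_{i=2}^{m} \sum_{j=2}^{n} x_{ij}^2 + \sum_{i=2}^{m} u_i^2 + \sum_{j=2}^{n} v_j^2 - 1 \Big\}. \end{aligned} \tag{12} \]

Obviously, this Gröbner basis is much more complicated than the one of the ideal $J_{M_{mn}}$ introduced above. Therefore, computations (both theoretical and numerical) with this alternative ideal seem to be more demanding. In any case, the variables $\{u_i\}_{i=1}^{m}$ and $\{v_j\}_{j=1}^{n}$ are only auxiliary ones, so one would like to eliminate these from the above Gröbner basis. By doing so, one obtains the Gröbner basis $\mathcal{G}_{M_{mn}}$ defined in (8). Notice that $\sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}^2 - 1 = g_{13} + \sum_{i=2}^{m} g_{10}^{i} + \sum_{j=2}^{n} g_{11}^{j}$ together with $\{g_{12}^{i,j,k,l}\}$ form the basis $\mathcal{G}_{M_{mn}}$.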


3.1. The $\theta_1$-norm in $\mathbb{R}^{2 \times 2}$. For the sake of illustration, we consider the specific example of $2 \times 2$ matrices and provide the corresponding semidefinite program for the computation of the $\theta_1$-norm explicitly. Let us denote the corresponding polynomial ideal in $\mathbb{R}[x] = \mathbb{R}[x_{11}, x_{12}, x_{21}, x_{22}]$ simply by

\[ J = J_{M_{22}} = \langle x_{12}x_{21} - x_{11}x_{22},\ x_{11}^2 + x_{12}^2 + x_{21}^2 + x_{22}^2 - 1 \rangle. \tag{13} \]

The associated algebraic variety is of the form

\[ \nu_{\mathbb{R}}(J) = \{x : x_{12}x_{21} = x_{11}x_{22},\ x_{11}^2 + x_{12}^2 + x_{21}^2 + x_{22}^2 = 1\} \]

and corresponds to the set of rank-one matrices with $\|X\|_F = 1$. Its convex hull consists of matrices $X \in \mathbb{R}^{2 \times 2}$ with $\|X\|_* \le 1$. According to Lemma 3.2, the Gröbner basis $\mathcal{G}$ of $J$ with respect to the grevlex order is

\[ \mathcal{G} = \{g_1 = x_{12}x_{21} - x_{11}x_{22},\ g_2 = x_{11}^2 + x_{12}^2 + x_{21}^2 + x_{22}^2 - 1\} \]

with the corresponding $\theta$-basis $B$ of $\mathbb{R}[x]/J$ restricted to degree two given as

\[ B_1 = \{1 + J,\ x_{11} + J,\ x_{12} + J,\ x_{21} + J,\ x_{22} + J\}, \]

\[ B_2 = B_1 \cup \{x_{11}x_{12} + J,\ x_{11}x_{21} + J,\ x_{11}x_{22} + J,\ x_{12}^2 + J,\ x_{12}x_{22} + J,\ x_{21}^2 + J,\ x_{21}x_{22} + J,\ x_{22}^2 + J\}. \]
Table 1. Linearization of the elements of $B_2 = \{f + J\}$ for the $2 \times 2$ matrix case.

    1 + J   x11 + J   x12 + J   x21 + J   x22 + J   x11x12 + J   x11x21 + J
    y0      x11       x12       x21       x22       y1           y2

    x11x22 + J   x12^2 + J   x12x22 + J   x21^2 + J   x21x22 + J   x22^2 + J
    y3           y4          y5           y6          y7           y8


The set $B_2$ consists of all monomials of degree at most two which are not divisible by a leading term of any of the polynomials inside the Gröbner basis $\mathcal{G}$. For example, $x_{11}x_{12} + J$ is an element of the theta basis $B$, but $x_{11}^2 + J$ is not since $x_{11}^2$ is divisible by $\mathrm{LT}(g_2)$.

Linearizing the elements of $B_2$ results in Table 1, where the monomials $f$ in the first row stand for an element $f + J \in B_2$. Therefore, $[x]_{B_1} = (1, x_{11}, x_{12}, x_{21}, x_{22})^T$ and the combinatorial moment matrix $M_{B_1}(x, y)$, see Definition 2.4, is given as

\[ M_{B_1}(x, y) = \begin{pmatrix} y_0 & x_{11} & x_{12} & x_{21} & x_{22} \\ x_{11} & -y_4 - y_6 - y_8 + y_0 & y_1 & y_2 & y_3 \\ x_{12} & y_1 & y_4 & y_3 & y_5 \\ x_{21} & y_2 & y_3 & y_6 & y_7 \\ x_{22} & y_3 & y_5 & y_7 & y_8 \end{pmatrix}. \]

For instance, the entry $(2,2)$ of $[x]_{B_1}[x]_{B_1}^T$ is of the form $x_{11}^2 + J = -x_{12}^2 - x_{21}^2 - x_{22}^2 + 1 + J$, where we exploit the second property in Definition 2.3 and the fact that $g_2 \in J$. Replacing $x_{12}^2 + J$ by $y_4$, etc. as in Table 1 yields the stated expression for $M_{B_1}(x, y)_{2,2}$.

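By Theorem 2.5 the first theta body $\mathrm{TH}_1(J)$ is the closure of

\[ Q_{B_1}(J) = \pi_x \big\{ (x, y) \in \mathbb{R}^{B_2} : M_{B_1}(x, y) \succeq 0,\ y_0 = 1 \big\}, \]

where $\pi_x$ represents the projection onto the variables $x$, i.e., the projection onto $x_{11}$, $x_{12}$, $x_{21}$, $x_{22}$. Furthermore, the $\theta_1$-norm of a matrix $X \in \mathbb{R}^{2 \times 2}$ induced by $\mathrm{TH}_1(J)$ and denoted by $\|\cdot\|_{\theta_1}$ can be computed as

\[ \|X\|_{\theta_1} = \inf t \quad\text{s.t.}\quad X \in t\,Q_{B_1}(J), \tag{14} \]

which is equivalent to

\[ \inf_{t \in \mathbb{R},\, y \in \mathbb{R}^8} t \quad\text{s.t.}\quad M = \begin{pmatrix} t & X_{11} & X_{12} & X_{21} & X_{22} \\ X_{11} & -y_4 - y_6 - y_8 + t & y_1 & y_2 & y_3 \\ X_{12} & y_1 & y_4 & y_3 & y_5 \\ X_{21} & y_2 & y_3 & y_6 & y_7 \\ X_{22} & y_3 & y_5 & y_7 & y_8 \end{pmatrix} \succeq 0. \tag{15} \]

Notice that $\operatorname{trace}(M) = 2t$. By Theorem 3.5 the above program is equivalent to the standard semidefinite program for computing the nuclear norm of a given matrix $X \in \mathbb{R}^{2 \times 2}$,

\[ \min_{W, Z} \frac{1}{2} \big( \operatorname{trace}(W) + \operatorname{trace}(Z) \big) \quad\text{s.t.}\quad \begin{pmatrix} W_{11} & W_{12} & X_{11} & X_{12} \\ W_{12} & W_{22} & X_{21} & X_{22} \\ X_{11} & X_{21} & Z_{11} & Z_{12} \\ X_{12} & X_{22} & Z_{12} & Z_{22} \end{pmatrix} \succeq 0. \]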


Remark 3. In compressive sensing, reconstruction of sparse signals via $\ell_1$-norm
minimization is well-understood, see for example unmiiia. It is possible to provide 
hierarchical relaxations via theta bodies of the unit $\ell_1$-norm ball. However, as in
the matrix scenario discussed above, all these relaxations coincide with the unit
$\ell_1$-norm ball, [57].


4. The tensor $\theta_k$-norm

Let us now turn to the tensor case and study the hierarchical closed convex relaxations of the unit tensor nuclear norm ball defined via theta bodies. Since in the matrix case all $\theta_k$-norms are equal to the matrix nuclear norm, their generalizations to the tensor case may all be viewed as natural generalizations of the nuclear norm. We focus mostly on the $\theta_1$-norm, whose unit norm ball is the largest in a hierarchical sequence of relaxations. Unlike in the matrix case, the $\theta_1$-norm defines a new tensor norm which, to the best of our knowledge, has not been studied before.

The polynomial ideal will be generated by the minors of order two of the unfoldings (and matricizations in the case $d \ge 4$) of the tensors, where each variable corresponds to one entry in the tensor. As we will see, a tensor is of rank one if and only if all order-two minors of the unfoldings (matricizations) vanish. While the order-three case requires considering all three unfoldings, there are several possibilities for the order-$d$ case when $d \ge 4$. In fact, a $d$th-order tensor is of rank one if all minors of all unfoldings vanish, so that it may be enough to consider only the unfoldings. However, one may as well consider the ideal generated by all minors of all matricizations, or one may consider a subset of matricizations including all unfoldings. Indeed, any tensor format, and thereby any notion of tensor rank, corresponds to a set of matricizations, and in this way one may associate a $\theta_k$-norm to a certain tensor format. We refer to e.g. [52] for some background on various tensor formats. However, as we will show later, the corresponding reduced Gröbner basis with respect to the grevlex order does not depend on the choice of the tensor format. We will mainly concentrate on the case that all matricizations are taken into account for defining the ideal. Only for the case $d = 4$ will we briefly discuss the case that the ideal is generated only by the minors corresponding to the four unfoldings.

Below, we first consider the special case of third-order tensors and then continue with fourth-order tensors. In Subsection 4.2 we will treat the general $d$th-order case.

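4.1. Third-order tensors. As described above, we will consider the order-two minors of all the unfoldings of a third-order tensor. Our notation requires the following sets of subscripts

\[ \begin{aligned} S_1 &= \{(\alpha, \beta) : 1 \le \alpha_1 < \beta_1 \le n_1,\ 1 \le \beta_2 \le \alpha_2 \le n_2,\ 1 \le \beta_3 \le \alpha_3 \le n_3\}, \\ S_2 &= \{(\alpha, \beta) : 1 \le \alpha_1 \le \beta_1 \le n_1,\ 1 \le \beta_2 < \alpha_2 \le n_2,\ 1 \le \alpha_3 \le \beta_3 \le n_3\}, \\ S_3 &= \{(\alpha, \beta) : 1 \le \alpha_1 \le \beta_1 \le n_1,\ 1 \le \alpha_2 \le \beta_2 \le n_2,\ 1 \le \beta_3 < \alpha_3 \le n_3\}, \\ \overline{S}_i &= \{(\alpha, \beta) : (\alpha, \beta) \in S_i \text{ and } \alpha_j \ne \beta_j \text{ for all } j \in [3]\}, \quad\text{for all } i \in [3]. \end{aligned} \]

The following polynomials $f^{(\alpha,\beta)}$ in $\mathbb{R}[x] = \mathbb{R}[x_{111}, x_{112}, \ldots, x_{n_1 n_2 n_3}]$ correspond to a subset of all order-two minors of all tensor unfoldings,

\[ f^{(\alpha,\beta)}(x) = x_\alpha x_\beta - x_{\alpha \vee \beta}\, x_{\alpha \wedge \beta}, \quad (\alpha, \beta) \in S := S_1 \cup S_2 \cup S_3, \]

\[ g_3(x) = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \sum_{k=1}^{n_3} x_{ijk}^2 - 1, \]

where $[\alpha \vee \beta]_i = \max\{\alpha_i, \beta_i\}$ and $[\alpha \wedge \beta]_i = \min\{\alpha_i, \beta_i\}$. In particular, the following order-two minor of $X^{\{1\}}$ is not contained in $\{f^{(\alpha,\beta)} : (\alpha, \beta) \in S\}$:

\[ f = x_\alpha x_\beta - x_{\hat{\alpha}} x_{\hat{\beta}}, \quad\text{where } \hat{\alpha} = (\alpha_1, \beta_2, \beta_3),\ \hat{\beta} = (\beta_1, \alpha_2, \alpha_3) \text{ and } (\alpha, \beta) \in \overline{S}_3. \]

We remark that in real algebraic geometry and commutative algebra, the polynomials $f^{(\alpha,\beta)}$ are known as Hibi relations, see [33].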


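Lemma 4.1. A tensor $X \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is a rank-one, unit Frobenius norm tensor if and only if

\[ g_3(X) = 0 \quad\text{and}\quad f^{(\alpha,\beta)}(X) = 0 \quad\text{for all } (\alpha, \beta) \in S. \tag{16} \]


Proof. Sufficiency of (16) follows directly from the definition of the rank-one unit Frobenius norm tensors. For necessity, the first step is to show that the mode-1 fibers (columns) span a one-dimensional space in $\mathbb{R}^{n_1}$. To this end, we note that for $\beta_2 \le \alpha_2$ and $\beta_3 \le \alpha_3$, the fibers $X_{\cdot \alpha_2 \alpha_3}$ and $X_{\cdot \beta_2 \beta_3}$ satisfy

\[ -X_{n_1 \alpha_2 \alpha_3}\, X_{\cdot \beta_2 \beta_3} + X_{n_1 \beta_2 \beta_3}\, X_{\cdot \alpha_2 \alpha_3} = \begin{pmatrix} -X_{1 \beta_2 \beta_3} X_{n_1 \alpha_2 \alpha_3} + X_{1 \beta_2 \beta_3} X_{n_1 \alpha_2 \alpha_3} \\ -X_{2 \beta_2 \beta_3} X_{n_1 \alpha_2 \alpha_3} + X_{2 \beta_2 \beta_3} X_{n_1 \alpha_2 \alpha_3} \\ \vdots \\ -X_{n_1 \beta_2 \beta_3} X_{n_1 \alpha_2 \alpha_3} + X_{n_1 \beta_2 \beta_3} X_{n_1 \alpha_2 \alpha_3} \end{pmatrix} = 0, \]

where we used that $f^{(\alpha,\beta)}(X) = 0$ for all $(\alpha, \beta) \in S$, i.e., $X_{n_1 \beta_2 \beta_3} X_{i \alpha_2 \alpha_3} = X_{i \beta_2 \beta_3} X_{n_1 \alpha_2 \alpha_3}$ for all $i \in [n_1 - 1]$. From $g_3(X) = 0$ it follows that the tensor $X$ is normalized.

Using similar arguments, one argues that the mode-2 fibers (rows) and the mode-3 fibers span one-dimensional spaces in $\mathbb{R}^{n_2}$ and $\mathbb{R}^{n_3}$, respectively. This completes the proof. □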


A third-order tensor $X \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is a rank-one tensor if and only if all three unfoldings $X^{\{1\}} \in \mathbb{R}^{n_1 \times n_2 n_3}$, $X^{\{2\}} \in \mathbb{R}^{n_2 \times n_1 n_3}$, and $X^{\{3\}} \in \mathbb{R}^{n_3 \times n_1 n_2}$ are rank-one matrices. Notice that $f^{(\alpha,\beta)}(X) = 0$ for all $(\alpha, \beta) \in S_i$ is equivalent to the statement that the $i$-th unfolding $X^{\{i\}}$ is a rank-one matrix, i.e., that all its order-two minors vanish, for all $i \in [3]$.

In order to define relaxations of the unit tensor nuclear norm ball we introduce the polynomial ideal $J_3 \subset \mathbb{R}[x] = \mathbb{R}[x_{111}, x_{112}, \ldots, x_{n_1 n_2 n_3}]$ as the one generated by

\[ \mathcal{G}_3 = \{f^{(\alpha,\beta)}(x) : (\alpha, \beta) \in S\} \cup \{g_3(x)\}, \tag{17} \]

i.e., $J_3 = \langle \mathcal{G}_3 \rangle$. Its real algebraic variety equals the set of rank-one third-order tensors with unit Frobenius norm and its convex hull coincides with the unit tensor nuclear norm ball. The next result provides the Gröbner basis of $J_3$.


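Theorem 4.2. The basis $\mathcal{G}_3$ defined in (17) forms the reduced Gröbner basis of the ideal $J_3 = \langle \mathcal{G}_3 \rangle$ with respect to the grevlex order.


Proof. Similarly to the proof of Lemma 3.2, we need to show that $S(p, q) \to_{\mathcal{G}_3} 0$ for all polynomials $p, q \in \mathcal{G}_3$ whose leading terms are not relatively prime. The leading monomials with respect to the grevlex ordering are given by

\[ \mathrm{LM}(g_3) = x_{111}^2 \quad\text{and}\quad \mathrm{LM}(f^{(\alpha,\beta)}) = x_\alpha x_\beta, \quad (\alpha, \beta) \in S. \]

The leading terms of $g_3$ and $f^{(\alpha,\beta)}$ are always relatively prime. First we consider two distinct polynomials $f, g \in \{f^{(\alpha,\beta)} : (\alpha, \beta) \in S_3\}$. Let $f = f^{(\alpha,\beta)}$ and $g = f^{(\alpha,\hat{\beta})}$ for $(\alpha, \beta) \in S_3$, where $\hat{\beta} = (\beta_1, \alpha_2, \beta_3)$. That is,

\[ f(x) = x_\alpha x_\beta - x_{\alpha \vee \beta}\, x_{\alpha \wedge \beta}, \qquad g(x) = x_\alpha x_{\hat{\beta}} - x_{\alpha \vee \hat{\beta}}\, x_{\alpha \wedge \hat{\beta}}. \]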
Table 2. Matrix nuclear norms of unfoldings and $\theta_1$-norm of tensors $X \in \mathbb{R}^{2 \times 2 \times 2}$, which are represented in the second column as $X = [X(:,:,1) \mid X(:,:,2)]$. The third, fourth and fifth column represent the nuclear norms of the first, second and third unfolding of a tensor $X$, respectively. The last column contains the numerically computed $\theta_1$-norm.

    No.   X in R^{2x2x2}
    1     [ 1 0 | 0 0 ]
          [ 0 0 | 0 1 ]
    2     [ 1 0 | 0 0 ]
          [ 0 1 | 0 0 ]
    3     [ 1 0 | 0 0 ]
          [ 0 0 | 1 0 ]
    4     [ 1 0 | 0 1 ]
          [ 0 0 | 0 0 ]
    5     [ 1 0 | 0 1 ]
          [ 0 1 | 0 0 ]

    Remaining columns: ||X^{1}||_*   ||X^{2}||_*   ||X^{3}||_*   ||X||_{theta_1}


2 

2 


2 

V2 
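Since $\alpha \wedge \beta = \alpha \wedge \hat{\beta}$ and $f^{(\beta, \alpha \vee \hat{\beta})} \in \{f^{(\alpha,\beta)} : (\alpha, \beta) \in S_2\}$, then

\[ S(f, g) = x_{\alpha \wedge \beta} \big( -x_{\hat{\beta}}\, x_{\alpha \vee \beta} + x_\beta\, x_{\alpha \vee \hat{\beta}} \big) = x_{\alpha \wedge \beta}\, f^{(\beta, \alpha \vee \hat{\beta})} \to_{\mathcal{G}_3} 0. \]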

Next we show that S{f,g) G J 3 , for / G : (a,/3) G 52 } and g G 

: (a,/3)GSi}. Let / = /(“’^) with ^ and g = /(“’^) 

with 3 = (/ 3 i,/ 32 ,a 3 ), where {a, [3) G 82 - Since g 

: {a,f3) g 53 }, and g {/(«./3) : {a,f3) G 5i} 

^ if ^9) = ^aAP (^~^P^aVP ^p^aVp) ~ ^otAP ^ 


For the remaining cases one proceeds similarly. In order to show that $\mathcal{G}_3$ is the reduced Gröbner basis, one uses the same arguments as in the proof of Lemma 3.2. □


Remark 4. The above Gröbner basis $\mathcal{G}_3$ is obtained by taking a particular subset of all order-two minors of all three unfoldings of the tensor $X \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ (not considering the same minor twice). One might think that the $\theta_1$-norm obtained in this way corresponds to a (weighted) sum of the nuclear norms of the unfoldings, which has been used in PH [3H] for tensor recovery. The examples of cubic tensors $X \in \mathbb{R}^{2 \times 2 \times 2}$ presented in Table 2 show that this is not the case. Assuming that the $\theta_1$-norm is a linear combination of the nuclear norms of the unfoldings, there exist $\alpha, \beta, \gamma \in \mathbb{R}$ such that $\alpha \|X^{\{1\}}\|_* + \beta \|X^{\{2\}}\|_* + \gamma \|X^{\{3\}}\|_* = \|X\|_{\theta_1}$. From the first and the second tensor in Table 2 we obtain $\gamma = 0$. Similarly, the first and the third tensor, and the first and the fourth tensor give $\beta = 0$ and $\alpha = 0$, respectively. Thus, the $\theta_1$-norm does not coincide with a weighted sum of the nuclear norms of the unfoldings. In addition, the last tensor shows that the $\theta_1$-norm does not equal the maximum of the norms of the unfoldings.


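Theorem 4.2 states that $\mathcal{G}_3$ is the reduced Gröbner basis of the ideal $J_3$ generated by all order-two minors of all matricizations of an order-three tensor. That is, $J_3$ is generated by the following polynomials

\[ \begin{aligned} f_{\{1\}}^{(\alpha,\beta)}(x) &= -x_{\alpha_1 \alpha_2 \alpha_3} x_{\beta_1 \beta_2 \beta_3} + x_{\alpha_1 \beta_2 \beta_3} x_{\beta_1 \alpha_2 \alpha_3}, && \text{for } (\alpha, \beta) \in \mathcal{T}^{\{1\}}, \\ f_{\{2\}}^{(\alpha,\beta)}(x) &= -x_{\alpha_1 \alpha_2 \alpha_3} x_{\beta_1 \beta_2 \beta_3} + x_{\beta_1 \alpha_2 \beta_3} x_{\alpha_1 \beta_2 \alpha_3}, && \text{for } (\alpha, \beta) \in \mathcal{T}^{\{2\}}, \\ f_{\{3\}}^{(\alpha,\beta)}(x) &= -x_{\alpha_1 \alpha_2 \alpha_3} x_{\beta_1 \beta_2 \beta_3} + x_{\beta_1 \beta_2 \alpha_3} x_{\alpha_1 \alpha_2 \beta_3}, && \text{for } (\alpha, \beta) \in \mathcal{T}^{\{3\}}, \end{aligned} \]

where $\{f_{\{k\}}^{(\alpha,\beta)} : (\alpha, \beta) \in \mathcal{T}^{\{k\}}\}$ is the set of all order-two minors of the $k$th unfolding and

\[ \mathcal{T}^{\{k\}} = \{(\alpha, \beta) : \alpha_k \ne \beta_k,\ \overline{\alpha} \ne \overline{\beta}, \text{ where } \overline{\alpha}_k = \overline{\beta}_k = 0,\ \overline{\alpha}_\ell = \alpha_\ell,\ \overline{\beta}_\ell = \beta_\ell \text{ for } \ell \ne k\}. \]

For $(\alpha, \beta) \in \mathcal{T}^{\{k\}}$, $x_{\alpha^{\{k\}}} x_{\beta^{\{k\}}}$ denotes a monomial where $\alpha^{\{k\}}_k = \alpha_k$, $\beta^{\{k\}}_k = \beta_k$, and $\alpha^{\{k\}}_\ell = \beta_\ell$, $\beta^{\{k\}}_\ell = \alpha_\ell$, for all $\ell \in [3] \setminus \{k\}$. Notice that

\[ f_{\{k\}}^{(\alpha,\beta)}(x) = f_{\{k\}}^{(\beta,\alpha)}(x) = -f_{\{k\}}^{(\alpha^{\{k\}}, \beta^{\{k\}})}(x) = -f_{\{k\}}^{(\beta^{\{k\}}, \alpha^{\{k\}})}(x), \]

for all $(\alpha, \beta) \in \mathcal{T}^{\{k\}}$ and all $k \in [3]$. Let us now consider a TT-format and a corresponding notion of tensor rank. Recall that the TT-rank of an order-three tensor is a vector $r = (r_1, r_2)$ where $r_1 = \operatorname{rank}(X^{\{1\}})$ and $r_2 = \operatorname{rank}(X^{\{1,2\}})$. Consequently, we consider an ideal $J_{3,\mathrm{TT}}$ generated by all order-two minors of the matricizations $X^{\{1\}}$ and $X^{\{1,2\}}$ of the order-3 tensor. That is, the ideal $J_{3,\mathrm{TT}}$ is generated by the polynomials

\[ \begin{aligned} f_{\{1\}}^{(\alpha,\beta)}(x) &= -x_{\alpha_1 \alpha_2 \alpha_3} x_{\beta_1 \beta_2 \beta_3} + x_{\alpha_1 \beta_2 \beta_3} x_{\beta_1 \alpha_2 \alpha_3}, && \text{for } (\alpha, \beta) \in \mathcal{T}^{\{1\}}, \\ f_{\{1,2\}}^{(\alpha,\beta)}(x) &= -x_{\alpha_1 \alpha_2 \alpha_3} x_{\beta_1 \beta_2 \beta_3} + x_{\alpha_1 \alpha_2 \beta_3} x_{\beta_1 \beta_2 \alpha_3}, && \text{for } (\alpha, \beta) \in \mathcal{T}^{\{1,2\}}, \end{aligned} \]

where $\mathcal{T}^{\{1,2\}} = \{(\alpha, \beta) : (\alpha_1, \alpha_2, 0) \ne (\beta_1, \beta_2, 0),\ \alpha_3 \ne \beta_3\}$.


Theorem 4.3. The polynomial ideals $J_3$ and $J_{3,\mathrm{TT}}$ are equal.

Remark 5. As a consequence, $\mathcal{G}_3$ is also the reduced Gröbner basis for the ideal $J_{3,\mathrm{TT}}$ with respect to the grevlex ordering.

Proof. Notice that $(X^{\{1,2\}})^T = X^{\{3\}}$ and therefore

\[ \{f_{\{3\}}^{(\alpha,\beta)}(x) : (\alpha, \beta) \in \mathcal{T}^{\{3\}}\} = \{f_{\{1,2\}}^{(\alpha,\beta)}(x) : (\alpha, \beta) \in \mathcal{T}^{\{1,2\}}\}. \]

Hence, it is enough to show that $f_{\{2\}}^{(\alpha,\beta)} \in J_{3,\mathrm{TT}}$ for all $(\alpha, \beta) \in \mathcal{T}^{\{2\}}$. By definition of $\mathcal{T}^{\{2\}}$ we have that $\alpha_2 \ne \beta_2$ and $(\alpha_1, 0, \alpha_3) \ne (\beta_1, 0, \beta_3)$. We can assume that $\alpha_3 \ne \beta_3$, since otherwise $f_{\{2\}}^{(\alpha,\beta)} = f_{\{1\}}^{(\alpha,\beta)}$. Analogously, $\alpha_1 \ne \beta_1$, since otherwise $f_{\{2\}}^{(\alpha,\beta)} = f_{\{1,2\}}^{(\alpha,\beta)}$. Consider the following polynomials

\[ \begin{aligned} f(x) &= -x_{\alpha_1 \alpha_2 \alpha_3} x_{\beta_1 \beta_2 \beta_3} + x_{\beta_1 \alpha_2 \beta_3} x_{\alpha_1 \beta_2 \alpha_3}, && (\alpha, \beta) \in \mathcal{T}^{\{2\}}, \\ g(x) &= -x_{\beta_1 \alpha_2 \alpha_3} x_{\alpha_1 \beta_2 \beta_3} + x_{\beta_1 \alpha_2 \beta_3} x_{\alpha_1 \beta_2 \alpha_3}, && (\beta_1, \alpha_2, \alpha_3, \alpha_1, \beta_2, \beta_3) \in \mathcal{T}^{\{1,2\}}, \\ h(x) &= -x_{\alpha_1 \alpha_2 \alpha_3} x_{\beta_1 \beta_2 \beta_3} + x_{\alpha_1 \beta_2 \beta_3} x_{\beta_1 \alpha_2 \alpha_3}, && (\alpha, \beta) \in \mathcal{T}^{\{1\}}. \end{aligned} \]

Thus, we have that $f(x) = g(x) + h(x) \in J_{3,\mathrm{TT}}$. □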


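4.2. The theta norm for general $d$th-order tensors. Let us now consider $d$th-order tensors in $\mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ for general $d \ge 4$. Our approach relies again on the fact that a tensor $X \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ is of rank one if and only if all its matricizations are rank-one matrices, or equivalently, if all minors of order two of each matricization vanish.

The description of the polynomial ideal generated by the second order minors of all matricizations of a tensor $X \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ unfortunately requires some technical notation. Again, we do not need all such minors in the generating set that we introduce next. In fact, this generating set will turn out to be the reduced Gröbner basis of the ideal.

Similarly to before, the entry $(\alpha_1, \alpha_2, \ldots, \alpha_d)$ of a tensor $X \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ corresponds to the variable $x_{\alpha_1 \alpha_2 \cdots \alpha_d}$ or simply $x_\alpha$. We aim at introducing a set of polynomials of the form

\[ f_d^{(\alpha,\beta)}(x) := -x_{\alpha \wedge \beta}\, x_{\alpha \vee \beta} + x_\alpha x_\beta \tag{18} \]

which will generate the desired polynomial ideal. These polynomials correspond to a subset of all order-two minors of all the possible $d$th-order tensor matricizations. The set $S$ denotes the indices where $\alpha$ and $\beta$ differ. Since for an order-two minor of a matricization the sets $\alpha$ and $\beta$ need to differ in at least two indices, $S$ is contained in

\[ \mathcal{S}_{[d]} := \{S \subset [d] : 2 \le |S| \le d\}. \]

Given the set $S$ of different indices, we require all non-empty subsets $\mathcal{M} \subset S$ of possible indices which are "switched" between $\alpha$ and $\beta$ for forming the minors in (18). This implies that, without loss of generality,

\[ \alpha_j > \beta_j \ \text{ for all } j \in \mathcal{M}, \qquad \alpha_k < \beta_k \ \text{ for all } k \in S \setminus \mathcal{M}. \]

That is, the same minor is obtained if we require that $\alpha_j < \beta_j$ for all $j \in \mathcal{M}$ and $\alpha_k > \beta_k$ for all $k \in S \setminus \mathcal{M}$, since the set of all two-minors of $X^{\mathcal{M}}$ coincides with the set of all two-minors of $X^{S \setminus \mathcal{M}}$.

For $S \in \mathcal{S}_{[d]}$, we define $e_S := \min\{p : p \in S\}$. The set $\mathcal{M}$ corresponds to an associated matricization $X^{\mathcal{M}}$. The set of possible subsets $\mathcal{M}$ is given as

\[ \mathcal{P}_S = \begin{cases} \{\mathcal{M} \subset S : |\mathcal{M}| \le \lfloor |S|/2 \rfloor\} \setminus \{\emptyset\}, & \text{if } |S| \text{ is odd}, \\ \{\mathcal{M} \subset S : |\mathcal{M}| \le \lfloor (|S|-1)/2 \rfloor\} \cup \{\mathcal{M} \subset S : |\mathcal{M}| = |S|/2,\ e_S \in \mathcal{M}\} \setminus \{\emptyset\}, & \text{otherwise.} \end{cases} \]

Notice that $\mathcal{P}_S \cup \mathcal{P}_S^c \cup \{\emptyset\} \cup \{S\}$ with $\mathcal{P}_S^c := \{\mathcal{M} : S \setminus \mathcal{M} \in \mathcal{P}_S\}$ forms the power set of $S$. The constraint on the size of $\mathcal{M}$ in the definition of $\mathcal{P}_S$ is motivated by the fact that the roles of $\alpha$ and $\beta$ can be switched and lead to the same polynomial $f_d^{(\alpha,\beta)}$.

Thus, for $S \in \mathcal{S}_{[d]}$ and $\mathcal{M} \in \mathcal{P}_S$, we define a set

\[ \mathcal{T}_d^{S,\mathcal{M}} := \{(\alpha, \beta) : \alpha_i = \beta_i \text{ for all } i \notin S,\ \ \alpha_j > \beta_j \text{ for all } j \in \mathcal{M},\ \ \alpha_k < \beta_k \text{ for all } k \in S \setminus \mathcal{M}\}. \]

For notational purposes, we define

\[ \{f_d^S\} := \{f_d^{(\alpha,\beta)} : (\alpha, \beta) \in \mathcal{T}_d^{S,\mathcal{M}},\ \mathcal{M} \in \mathcal{P}_S\} \quad\text{for } S \in \mathcal{S}_{[d]}. \]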

Since we are interested in unit Frobenius norm tensors, we also introduce the polynomial

\[ g_d(x) = \sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} \cdots \sum_{i_d=1}^{n_d} x_{i_1 i_2 \cdots i_d}^2 - 1. \]

Our polynomial ideal is then the one generated by the polynomials in

\[ \mathcal{G}_d = \bigcup_{S \in \mathcal{S}_{[d]}} \{f_d^S\} \cup \{g_d\} \subset \mathbb{R}[x] = \mathbb{R}[x_{11\cdots1}, x_{11\cdots2}, \ldots, x_{n_1 n_2 \cdots n_d}], \]

i.e., $J_d = \langle \mathcal{G}_d \rangle$. As in the special case of third-order tensors, not all second order minors corresponding to all matricizations are contained in the generating set $\mathcal{G}_d$, due to the condition $\alpha_k < \beta_k$ for all $k \in S \setminus \mathcal{M}$ in the definition of $\mathcal{T}_d^{S,\mathcal{M}}$. Nevertheless, all second order minors are contained in the ideal $J_d$, as will also be revealed by the proof of Theorem 4.4 below. For instance, $h(x) = -x_{1234}x_{2343} + x_{1243}x_{2334}$, corresponding to a minor of the matricization $X^{\mathcal{M}}$ for $\mathcal{M} = \{1, 2\}$, does not belong to $\mathcal{G}_4$, but it does belong to the ideal $J_4$. Moreover, it is straightforward to verify that all polynomials in $\mathcal{G}_d$ differ from each other.

The algebraic variety of $J_d$ consists of all rank-one unit Frobenius norm order-$d$ tensors as desired, and its convex hull yields the tensor nuclear norm ball.
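To make the index sets concrete, the following Python sketch enumerates the pairs $(\alpha, \beta)$ that label the generators $f_d^{(\alpha,\beta)}$ for small dimensions. It is an illustration under our reading of the definitions: $\alpha$ and $\beta$ agree outside a set $S$ with $|S| \ge 2$, $\alpha_j > \beta_j$ on the switched set $\mathcal{M}$, $\alpha_k < \beta_k$ on $S \setminus \mathcal{M}$, and $\mathcal{M}$ has at most $\lfloor |S|/2 \rfloor$ elements, where a half-size $\mathcal{M}$ must contain $e_S = \min S$. The function names are our own, and coordinates are 0-based in the code.

```python
from itertools import combinations, product


def admissible_M(S):
    """The family P_S of switched index sets for a difference set S."""
    S = sorted(S)
    s = len(S)
    out = []
    for r in range(1, s // 2 + 1):
        for M in combinations(S, r):
            if s % 2 == 0 and r == s // 2 and S[0] not in M:
                continue  # half-size sets must contain e_S = min(S)
            out.append(frozenset(M))
    return out


def index_pairs(dims):
    """All (alpha, beta) labelling the generators f_d^{(alpha,beta)}."""
    d = len(dims)
    pairs = []
    for size in range(2, d + 1):
        for S in combinations(range(d), size):
            for M in admissible_M(S):
                choices = []
                for k in range(d):
                    vals = range(1, dims[k] + 1)
                    if k not in S:
                        choices.append([(a, a) for a in vals])
                    elif k in M:
                        choices.append([(a, b) for a in vals for b in vals if a > b])
                    else:
                        choices.append([(a, b) for a in vals for b in vals if a < b])
                for combo in product(*choices):
                    alpha = tuple(c[0] for c in combo)
                    beta = tuple(c[1] for c in combo)
                    pairs.append((alpha, beta))
    return pairs


def f_value(X, alpha, beta):
    """Evaluate -x_{alpha ^ beta} x_{alpha v beta} + x_alpha x_beta on X."""
    lo = tuple(map(min, alpha, beta))
    hi = tuple(map(max, alpha, beta))
    return -X[lo] * X[hi] + X[alpha] * X[beta]


def rank_one_tensor(vectors):
    """Outer product of the given vectors, as a dict index -> value."""
    X = {}
    for idx in product(*(range(1, len(v) + 1) for v in vectors)):
        val = 1.0
        for k, i in enumerate(idx):
            val *= vectors[k][i - 1]
        X[idx] = val
    return X


print(len(index_pairs((2, 2, 2))))  # number of generators besides g_d
X = rank_one_tensor([[1.0, -2.0], [0.5, 3.0, 1.5], [2.0, -1.0]])
print(max(abs(f_value(X, a, b)) for a, b in index_pairs((2, 3, 2))))
```

The second printed value is zero up to rounding, reflecting that every generator $f_d^{(\alpha,\beta)}$ vanishes on rank-one tensors. The vanishing does not depend on the normalization, which is enforced separately by $g_d$.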


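Theorem 4.4. The set $\mathcal{G}_d$ forms the reduced Gröbner basis of the ideal $J_d$ with respect to the grevlex order.

Proof. Again, we use Buchberger's criterion stated in Theorem A.7. First notice that the leading monomials of the polynomials $g_d$ and $f_d^{(\alpha,\beta)}$ are always relatively prime, since $\mathrm{LM}(g_d) = x_{11\cdots1}^2$ and $\mathrm{LM}(f_d^{(\alpha,\beta)}) = x_\alpha x_\beta$ for $(\alpha, \beta) \in \mathcal{T}_d^{S,\mathcal{M}}$, where $S \in \mathcal{S}_{[d]}$ and $\mathcal{M} \in \mathcal{P}_S$. Therefore, we need to show that $S(f_1, f_2) \to_{\mathcal{G}_d} 0$ for all $f_1, f_2 \in \mathcal{G}_d \setminus \{g_d\}$ with $f_1 \ne f_2$. To this end, we analyze the division algorithm on $\langle \mathcal{G}_d \rangle$.

Let $f_1, f_2 \in \mathcal{G}_d$ with $f_1 \ne f_2$. Then it holds $\mathrm{LM}(f_1) \ne \mathrm{LM}(f_2)$. If these leading monomials are not relatively prime, the $S$-polynomial is of the form

\[ S(f_1, f_2) = x_{\alpha^1} x_{\alpha^2} x_{\alpha^3} - x_{\bar{\alpha}^1} x_{\bar{\alpha}^2} x_{\bar{\alpha}^3} \]

with $\{\alpha^1_k, \alpha^2_k, \alpha^3_k\} = \{\bar{\alpha}^1_k, \bar{\alpha}^2_k, \bar{\alpha}^3_k\}$ for all $k \in [d]$.

The step-by-step procedure of the division algorithm for our scenario is presented in Algorithm 2. We will show that the algorithm eventually stops and that step 2) is feasible, i.e., that there always exist $k$ and $\ell$ such that line 7 of Algorithm 2 holds, provided that $S^i \ne 0$. (In fact, the purpose of the algorithm is to achieve the condition that in the $i$th iteration of the algorithm $\hat{\alpha}^{1,i}_k \le \hat{\alpha}^{2,i}_k \le \hat{\alpha}^{3,i}_k$, for all $k \in [d]$.) This will show then that $S(f_1, f_2) \to_{\mathcal{G}_d} 0$.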


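Algorithm 2 The division algorithm on the ideal $\langle \mathcal{G}_d \rangle$.

Input: polynomials $f_1, f_2 \in \mathcal{G}_d$

$S^0 = S(f_1, f_2) = x_{\alpha^1} x_{\alpha^2} x_{\alpha^3} - x_{\bar{\alpha}^1} x_{\bar{\alpha}^2} x_{\bar{\alpha}^3}$, $i = 0$

while $S^i \ne 0$ do

    1) Let $\mathrm{LM}(S^i) = x_{\alpha^{1,i}} x_{\alpha^{2,i}} x_{\alpha^{3,i}}$ and $\mathrm{NLM}(S^i) = |S^i - \mathrm{LT}(S^i)|$.

    2) Find indices $\hat{\alpha}^{1,i}, \hat{\alpha}^{2,i} \in \{\alpha^{1,i}, \alpha^{2,i}, \alpha^{3,i}\}$ such that there exist at least one $k$ and at least one $\ell$ for which
       $\hat{\alpha}^{1,i}_k < \hat{\alpha}^{2,i}_k$ and $\hat{\alpha}^{1,i}_\ell > \hat{\alpha}^{2,i}_\ell$, s.t. $\mathcal{M}_i := \{\ell \in [d] : \hat{\alpha}^{1,i}_\ell > \hat{\alpha}^{2,i}_\ell\} \in \mathcal{P}_S$,
       where $S := \{k \in [d] : \hat{\alpha}^{1,i}_k \ne \hat{\alpha}^{2,i}_k\}$, and let $\hat{\alpha}^{3,i}$ be the remaining index in $\{\alpha^{1,i}, \alpha^{2,i}, \alpha^{3,i}\} \setminus \{\hat{\alpha}^{1,i}, \hat{\alpha}^{2,i}\}$.

    3) Divide $S^i$ by $f_d^{(\hat{\alpha}^{1,i}, \hat{\alpha}^{2,i})} = x_{\hat{\alpha}^{1,i}} x_{\hat{\alpha}^{2,i}} - x_{\hat{\alpha}^{1,i} \wedge \hat{\alpha}^{2,i}}\, x_{\hat{\alpha}^{1,i} \vee \hat{\alpha}^{2,i}}$ to obtain
       $S^i = \mathrm{LC}(S^i) \big[ x_{\hat{\alpha}^{3,i}} \big( -x_{\hat{\alpha}^{1,i} \wedge \hat{\alpha}^{2,i}}\, x_{\hat{\alpha}^{1,i} \vee \hat{\alpha}^{2,i}} + x_{\hat{\alpha}^{1,i}} x_{\hat{\alpha}^{2,i}} \big) + x_{\hat{\alpha}^{1,i} \wedge \hat{\alpha}^{2,i}}\, x_{\hat{\alpha}^{1,i} \vee \hat{\alpha}^{2,i}}\, x_{\hat{\alpha}^{3,i}} - \mathrm{NLM}(S^i) \big]$.

    4) Define $S^{i+1} := x_{\hat{\alpha}^{1,i} \wedge \hat{\alpha}^{2,i}}\, x_{\hat{\alpha}^{1,i} \vee \hat{\alpha}^{2,i}}\, x_{\hat{\alpha}^{3,i}} - \mathrm{NLM}(S^i)$.

    5) $i = i + 1$

end while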


Before passing to the general proof, we illustrate the division algorithm on an 
example for d = 4. The experienced reader may skip this example. 










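Let $f_1(x) := f_4^{(1212,2123)}(x) = -x_{1112}x_{2223} + x_{1212}x_{2123} \in \mathcal{G}_4$ (with the corresponding sets $S = \{1, 2, 3, 4\}$, $\mathcal{M} = \{2\}$) and $f_2(x) := f_4^{(3311,2123)}(x) = -x_{2111}x_{3323} + x_{3311}x_{2123} \in \mathcal{G}_4$ (with the corresponding sets $S = \{1, 2, 3, 4\}$, $\mathcal{M} = \{1, 2\}$). We will show that $S(f_1, f_2) = -x_{1112}x_{2223}x_{3311} + x_{1212}x_{2111}x_{3323} \to_{\mathcal{G}_4} 0$ by going through the division algorithm.

In iteration $i = 0$ we set $S^0 = S(f_1, f_2) = -x_{1112}x_{2223}x_{3311} + x_{1212}x_{2111}x_{3323}$. The leading monomial is $\mathrm{LM}(S^0) = x_{1112}x_{2223}x_{3311}$, the leading coefficient is $\mathrm{LC}(S^0) = -1$, and the non-leading monomial is $\mathrm{NLM}(S^0) = x_{1212}x_{2111}x_{3323}$. Among the two options for choosing a pair of indices $(\hat{\alpha}^{1,0}, \hat{\alpha}^{2,0})$ in step 2), we decide to take $\hat{\alpha}^{1,0} = 1112$ and $\hat{\alpha}^{2,0} = 3311$, which leads to the set $\mathcal{M}_0 = \{4\}$. The polynomial $x_{\hat{\alpha}^{1,0}} x_{\hat{\alpha}^{2,0}} - x_{\hat{\alpha}^{1,0} \wedge \hat{\alpha}^{2,0}}\, x_{\hat{\alpha}^{1,0} \vee \hat{\alpha}^{2,0}}$ then equals the polynomial $f_4^{(1112,3311)}(x) = -x_{1111}x_{3312} + x_{1112}x_{3311} \in \mathcal{G}_4$ and we can write

\[ S^0 = -1 \cdot \big( x_{2223} (-x_{1111}x_{3312} + x_{1112}x_{3311}) + \underbrace{x_{1111}x_{2223}x_{3312} - x_{1212}x_{2111}x_{3323}}_{= S^1} \big). \]

The leading and non-leading monomials of $S^1$ are $\mathrm{LM}(S^1) = x_{1111}x_{2223}x_{3312}$ and $\mathrm{NLM}(S^1) = x_{1212}x_{2111}x_{3323}$, respectively, while $\mathrm{LC}(S^1) = 1$. The only option for a pair of indices as in line 7 of Algorithm 2 is $\hat{\alpha}^{1,1} = 3312$, $\hat{\alpha}^{2,1} = 2223$, so that the set $\mathcal{M}_1 = \{1, 2\}$. The divisor $x_{\hat{\alpha}^{1,1}} x_{\hat{\alpha}^{2,1}} - x_{\hat{\alpha}^{1,1} \wedge \hat{\alpha}^{2,1}}\, x_{\hat{\alpha}^{1,1} \vee \hat{\alpha}^{2,1}}$ in step 3) equals $f_4^{(3312,2223)}(x) = -x_{2212}x_{3323} + x_{3312}x_{2223} \in \mathcal{G}_4$ and we obtain

\[ S^1 = 1 \cdot \big( x_{1111} (-x_{2212}x_{3323} + x_{2223}x_{3312}) + \underbrace{x_{1111}x_{2212}x_{3323} - x_{1212}x_{2111}x_{3323}}_{= S^2} \big). \]

The index sets of the monomial $x_{\alpha^1} x_{\alpha^2} x_{\alpha^3} = x_{1111}x_{2212}x_{3323}$ in $S^2$ satisfy

\[ \alpha^1_k \le \alpha^2_k \le \alpha^3_k \quad\text{for all } k \in [4] \]

and therefore it is the non-leading monomial of $S^2$, i.e., $\mathrm{NLM}(S^2) = x_{1111}x_{2212}x_{3323}$. Thus, $\mathrm{LM}(S^2) = x_{1212}x_{2111}x_{3323}$ and $\mathrm{LC}(S^2) = -1$. Now the only option for a pair of indices as in step 2) is $\hat{\alpha}^{1,2} = 2111$, $\hat{\alpha}^{2,2} = 1212$ with $\mathcal{M}_2 = \{1\}$. This yields

\[ S^2 = -1 \cdot \big( x_{3323} (-x_{1111}x_{2212} + x_{2111}x_{1212}) + \underbrace{x_{1111}x_{2212}x_{3323} - x_{1111}x_{2212}x_{3323}}_{= S^3 = 0} \big). \]

Thus, the division algorithm stops and we obtained after three steps

\[ S(f_1, f_2) = S^0 = \mathrm{LC}(S^0)\, x_{2223}\, f_4^{(1112,3311)}(x) + \mathrm{LC}(S^0)\mathrm{LC}(S^1)\, x_{1111}\, f_4^{(3312,2223)}(x) + \mathrm{LC}(S^0)\mathrm{LC}(S^1)\mathrm{LC}(S^2)\, x_{3323}\, f_4^{(2111,1212)}(x). \]

Thus, $S(f_1, f_2) \to_{\mathcal{G}_4} 0$.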
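The three division steps of this example can be verified mechanically. The following Python sketch is an illustration with our own minimal polynomial arithmetic: polynomials are dictionaries from monomials to integer coefficients, and a monomial is a sorted tuple of four-digit index strings. It recomputes $S(f_1, f_2) = x_{3311} f_1 - x_{1212} f_2$ and compares it with the stated combination of the three divisors.

```python
from collections import Counter


def mono(*idx):
    """A monomial as a sorted tuple of index strings."""
    return tuple(sorted(idx))


def f4(alpha, beta):
    """f_4^{(alpha,beta)} = -x_{alpha ^ beta} x_{alpha v beta} + x_alpha x_beta."""
    lo = "".join(min(a, b) for a, b in zip(alpha, beta))
    hi = "".join(max(a, b) for a, b in zip(alpha, beta))
    return Counter({mono(lo, hi): -1, mono(alpha, beta): 1})


def times(c, var, poly):
    """Multiply a polynomial by the term c * x_var."""
    return Counter({mono(var, *m): c * coef for m, coef in poly.items()})


def add(*polys):
    """Sum of polynomials, with zero coefficients removed."""
    total = Counter()
    for p in polys:
        for m, c in p.items():
            total[m] += c
    return {m: c for m, c in total.items() if c != 0}


f1 = f4("1212", "2123")
f2 = f4("3311", "2123")

# S-polynomial: the common leading factor x_2123 is cancelled.
S = add(times(1, "3311", f1), times(-1, "1212", f2))

# Combination produced by the three steps of the division algorithm,
# with coefficients LC(S^0) = -1, LC(S^0)LC(S^1) = -1 and
# LC(S^0)LC(S^1)LC(S^2) = +1.
decomposition = add(
    times(-1, "2223", f4("1112", "3311")),
    times(-1, "1111", f4("3312", "2223")),
    times(1, "3323", f4("2111", "1212")),
)

print(S == decomposition)
```

The comparison only confirms the final representation of $S(f_1, f_2)$; it does not by itself check that each chosen divisor belongs to $\mathcal{G}_4$.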


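Let us now return to the general proof. We first show that there always exist indices $\hat{\alpha}^{1,i}, \hat{\alpha}^{2,i}$ satisfying line 7 of Algorithm 2 unless $S^i = 0$. We start by setting $x^{\alpha^i} = x_{\alpha^{1,i}} x_{\alpha^{2,i}} x_{\alpha^{3,i}}$ with $x_{\alpha^{1,i}} \ge x_{\alpha^{2,i}} \ge x_{\alpha^{3,i}}$ to be the leading monomial and $x^{\bar{\alpha}^i}$ to be the non-leading monomial of $S^i$. The existence of a polynomial $h \in \mathcal{G}_d$ such that $\mathrm{LM}(h)$ divides $\mathrm{LM}(S^i) = x_{\alpha^{1,i}} x_{\alpha^{2,i}} x_{\alpha^{3,i}} = x^{\alpha^i}$ is equivalent to the existence of $\hat{\alpha}^{1,i}, \hat{\alpha}^{2,i} \in \{\alpha^{1,i}, \alpha^{2,i}, \alpha^{3,i}\}$ such that there exist at least one $k$ and at least one $\ell$ for which $\hat{\alpha}^{1,i}_k < \hat{\alpha}^{2,i}_k$ and $\hat{\alpha}^{1,i}_\ell > \hat{\alpha}^{2,i}_\ell$. If such a pair does not exist in iteration $i$, we have

\[ \hat{\alpha}^{1,i}_k \le \hat{\alpha}^{2,i}_k \le \hat{\alpha}^{3,i}_k \quad\text{for all } k \in [d]. \tag{19} \]

We claim that this cannot happen if $S^i \ne 0$. In fact, (19) would imply that the monomial $x^{\alpha^i} = x_{\alpha^{1,i}} x_{\alpha^{2,i}} x_{\alpha^{3,i}}$ is the smallest monomial $x_\beta x_\gamma x_\eta$ (with respect to the grevlex order) which satisfies

\[ \{\beta_k, \gamma_k, \eta_k\} = \{\alpha^{1,i}_k, \alpha^{2,i}_k, \alpha^{3,i}_k\} \quad\text{for all } k \in [d]. \]

However, then $x^{\alpha^i}$ would not be the leading monomial by definition of the grevlex order, which leads to a contradiction. Hence, we can always find indices $\hat{\alpha}^{1,i}, \hat{\alpha}^{2,i}$ satisfying line 7 in step 2) of Algorithm 2 unless $S^i = 0$.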

Next we show that the division algorithm always stops in a finite number of steps. 
We start with iteration f = 0 and assume that S° ^ 0. We choose q:^’° 

as in step 2) of Algorithm]^ Then we divide the polynomial S® by a polynomial 
h £ Gd such that LM(/i) = Xcti,oXa3,o. The polynomial h £ Qd is defined as in step 
3) of the algorithm, i.e., 

h{x) = id ~ 3^0,1, 0 X 0 , 2,0 — Xoi.OAa^.oXoi.Ova^-O G Gd- 

The division of S° by h results in 

S° = LC(S°)(xo3.0 • + Xo1.0ao2.0Xo1.0vo2 . 0Xo3.0 -NLM(S°)J . 

' Tsi ' 

Note that by construction 

A ^ < [a^’° V a^’°] ^ for all k£[d]. (20) 

If S^ ^ 0, then in the following iteration i = 1 we can assume LM(S^) = 
Xoi.oAQ;3.oXoi.oAa2,oXo3.o. Due to (20), a pair as in line 7 of Algorithm]^ 

can be either A q:^’° or Let us assume the former. Then 

this iteration results in 

= LC(|S'^) (^Xo3,i • /j ^ + Xoi.iAct3,i3^ai.ivct3,i2;o3,i — NLM(5'*^) \ 

= 

with 

[a^’^ A ^ V ^ for all k £ [d\, and Xo3,i = Xocoyo^.o • 

Next, ii ^ 0 and LM(5'^) = XctiA/^a^,iXaiA\/a^,iXa3,i then a pair of indices 
satisfying line 7 of Algorithm must be V so that the iteration ends 

up with 

/ 1,2 2,2\ 

(CK ’ .CK ’ 1 


= LC(S'2)(, 
such that 


Pd 


+ Xol.2Act3,2Xol.2vct3.2Xo3,2 — NLM(o ) 

■V 

= 


). < A ^ V ^ for all k £ [d] , and Xo3,2 = XoLIaq^.i . 

Thus, in iteration $i = 3$ the leading monomial $\mathrm{LM}(S^3)$ must be $\mathrm{NLM}(S^0)$ (unless
$S^3 = 0$).

A similar analysis can be performed on the monomial $\mathrm{NLM}(S^0)$ and therefore
the algorithm stops after at most 6 iterations. The division algorithm results in

$$S(f_1, f_2) = \sum_{i=0}^{p} \Big( \prod_{j=0}^{i} \mathrm{LC}(S^j) \Big) x_{\alpha^{3,i}} h_i,$$

where $h_i = -x_{\alpha^{1,i} \wedge \alpha^{2,i}} x_{\alpha^{1,i} \vee \alpha^{2,i}} + x_{\alpha^{1,i}} x_{\alpha^{2,i}} \in \mathcal{G}_d$ and $p \le 5$. All the cases
that we left out above are treated in a similar way. This shows that $\mathcal{G}_d$ is a Gröbner
basis of $J_d$.
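The termination argument can be checked by brute force on small index sets. The following Python sketch is an illustration only and not part of the proof: it tracks just the leading-monomial side of the division, represents indices as integer tuples, and uses hypothetical helper names (`meet`, `join`, `reduce_to_chain`, `max_steps`). It repeatedly replaces a componentwise incomparable pair by its componentwise minimum and maximum and counts the replacements needed until the three indices are totally ordered.

```python
from itertools import combinations, product

def meet(a, b):
    """Componentwise minimum of two indices (alpha wedge beta)."""
    return tuple(min(x, y) for x, y in zip(a, b))

def join(a, b):
    """Componentwise maximum of two indices (alpha vee beta)."""
    return tuple(max(x, y) for x, y in zip(a, b))

def comparable(a, b):
    """True if a <= b or b <= a in every component."""
    return meet(a, b) in (a, b)

def reduce_to_chain(indices):
    """Replace incomparable pairs by (meet, join) until none is left.

    Returns the final list of indices and the number of replacements.
    """
    current = list(indices)
    steps = 0
    while True:
        pair = next(((p, q) for p, q in combinations(range(len(current)), 2)
                     if not comparable(current[p], current[q])), None)
        if pair is None:
            return current, steps
        p, q = pair
        a, b = current[p], current[q]
        current[p], current[q] = meet(a, b), join(a, b)
        steps += 1

def max_steps(d, n):
    """Largest number of replacements over all triples of indices in [n]^d."""
    idx = list(product(range(1, n + 1), repeat=d))
    return max(reduce_to_chain(t)[1] for t in product(idx, repeat=3))
```

In this sketch, every triple with entries in a small range reaches a chain after at most three replacements, which matches the count of three iterations per monomial used above.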
In order to show that $\mathcal{G}_d$ is the reduced Gröbner basis of $J_d$, first notice that
$\mathrm{LC}(g) = 1$ for all $g \in \mathcal{G}_d$. Furthermore, the leading term of any polynomial in $\mathcal{G}_d$ is
of degree two. Thus, it is enough to show that for every pair of different polynomials
f!r ^ ftT ^ ^ Gd (related to Si, Mi and S 2 ,M 2 , respectively) it holds that 
^ with (a^/3'=) G for fc = 1,2. But this follows 

from the fact that all elements of Qd are different as remarked before the statement 
of the theorem. □ 

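We define the tensor $\theta_k$-norm analogously to the matrix scenario.

Definition 4.5. The tensor $\theta_k$-norm, denoted by $\|\cdot\|_{\theta_k}$, is the norm induced by the
$k$-theta body $\mathrm{TH}_k(J_d)$, i.e.,

$$\|\mathbf{X}\|_{\theta_k} = \inf \{ r : \mathbf{X} \in r \, \mathrm{TH}_k(J_d) \}.$$

The $\theta_k$-norm can be computed with the help of Theorem 2.5, i.e., as

$$\|\mathbf{X}\|_{\theta_k} = \min t \quad \text{subject to } \mathbf{X} \in t \, \mathbf{Q}_{\mathcal{B}_k}(J_d).$$

Given the moment matrix $\mathbf{M}_{\mathcal{B}_k}[\mathbf{y}]$ associated with $J_d$, this minimization program is
equivalent to the semidefinite program

$$\min_{t \in \mathbb{R},\, \mathbf{y} \in \mathbb{R}^{\mathcal{B}_{2k}}} t \quad \text{subject to } \mathbf{M}_{\mathcal{B}_k}[\mathbf{y}] \succeq 0, \; y_0 = t, \; \mathbf{y}_{\mathcal{B}_1} = \mathbf{X}. \tag{21}$$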

We have focused on the polynomial ideal generated by all second order minors 
of all matricizations of the tensor. One may also consider a subset of all possible 
matricizations corresponding to various tensor decompositions and notions of tensor 
rank. For example, the Tucker(HOSVD)-rank (corresponding to the Tucker or
HOSVD decomposition) of a $d$th-order tensor $\mathbf{X}$ is a $d$-dimensional vector
$\mathbf{r}_{\mathrm{HOSVD}} = (r_1, r_2, \ldots, r_d)$ such that $r_i = \operatorname{rank}(\mathbf{X}^{\{i\}})$ for all $i \in [d]$, see [55]. Thus, we can
define an ideal $J_{d,\mathrm{HOSVD}}$ generated by all second order minors of unfoldings $\mathbf{X}^{\{k\}}$,
for $k \in [d]$.

The tensor train (TT) decomposition is another popular approach for tensor
computations. The corresponding TT-rank of a $d$th-order tensor $\mathbf{X}$ is a $(d-1)$-dimensional
vector $\mathbf{r}_{\mathrm{TT}} = (r_1, r_2, \ldots, r_{d-1})$ such that $r_i = \operatorname{rank}(\mathbf{X}^{\{1,\ldots,i\}})$, $i \in [d-1]$,
see [IHj for details. By taking into account only minors of order two
of the matricizations $\tau \in \{\{1\}, \{1,2\}, \ldots, \{1,2,\ldots,d-1\}\}$, one may introduce a
corresponding polynomial ideal $J_{d,\mathrm{TT}}$.

Theorem 4.6. The polynomial ideals $J_d$, $J_{d,\mathrm{HOSVD}}$, and $J_{d,\mathrm{TT}}$ are equal, for all
$d \ge 3$.

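Proof. Let $\tau \subset [d]$ represent a matricization. Similarly to the case of order-three
tensors, for a pair of indices $(\alpha, \beta)$, $x_{\alpha^\tau} x_{\beta^\tau}$ denotes the monomial where $\alpha^\tau_k = \alpha_k$, $\beta^\tau_k = \beta_k$
for all $k \in \tau$ and $\alpha^\tau_\ell = \beta_\ell$, $\beta^\tau_\ell = \alpha_\ell$ for all $\ell \in \tau^c = [d] \setminus \tau$. Moreover, $x_{\alpha^{\tau,0}} x_{\beta^{\tau,0}}$
denotes the monomial where $\alpha^{\tau,0}_k = \alpha_k$, $\beta^{\tau,0}_k = \beta_k$ for all $k \in \tau$ and $\alpha^{\tau,0}_\ell = \beta^{\tau,0}_\ell = 0$
for all $\ell \in \tau^c = [d] \setminus \tau$. The corresponding order-two minors are defined as

$$f^{\tau}_{(\alpha,\beta)}(x) = -x_\alpha x_\beta + x_{\alpha^\tau} x_{\beta^\tau}, \quad (\alpha, \beta) \in \mathcal{T}^\tau.$$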

We define the set as 

T" = {(a,/3) : ^ ^ . 

Similarly as in the case of order-three tensors, notice that (x) = f^^ (x) = 

“/(a-./ 3 -)(^) = fo'' all («,/3) G First, we show that Jd = 

dd,HOSVD by showing that f{a,f3)i^) G dd,HosvD, for all (a,/3) G T'^ and all |t| > 2. 
Without loss of generality, we can assume that ai pi, for alH G r since otherwise 
we can consider the matricization t\ {i : ai = Pi}. Additionally, by definition of 
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there exists at least one £ £ such that ae ^ pi. Let r = ■ • ■ ,tfe} 

with ti < ti+i, for all i £ [fc — 1] and k > 2. Next, fix {a, (3) £ and define 
= a and /3° = (3. Algorithm results in polynomials gk £ Js.tt such that 
/(a./3)(^) = SLi 5»(x). This follows from 

k k 

^ ^ ^ ^ X^^-3 “h XqjI X|^^ ^ X -\- Xfy^k X f[^cx.,(5)^'^' 

By the definition of polynomials gk it is obvious that 

9i ^ {/(a^/ 3 )W : G : for all i£[k]. 

Next, we show that Jd = Jd,TT- Since Jd = =/d,HOSVD) it is enough to show 


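Algorithm 3 Algorithm for proving that $J_d = J_{d,\mathrm{TT}}$

Input: An ideal $J_{d,\mathrm{TT}} \subset \mathbb{R}[x]$, polynomial $f^{\tau}_{(\alpha,\beta)}$ with $\alpha^0 = \alpha$, $\beta^0 = \beta$,
       $\tau = \{t_1, \ldots, t_k\}$, where $k \ge 2$
for $i = 1, \ldots, k$ do
    Define $\alpha^i$ and $\beta^i$ as
        $\alpha^i_j := \beta^{i-1}_j$ if $j = t_i$, and $\alpha^i_j := \alpha^{i-1}_j$ otherwise,
        $\beta^i_j := \alpha^{i-1}_j$ if $j = t_i$, and $\beta^i_j := \beta^{i-1}_j$ otherwise.
    Define polynomial $g_i(x) := -x_{\alpha^{i-1}} x_{\beta^{i-1}} + x_{\alpha^i} x_{\beta^i}$
end for
Output: Polynomials $g_1, g_2, \ldots, g_k$.
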


that G Jd,TT, for all {a,P) £ and all k £ [d]. By definition of 

Jd,TT this is true for k = 1. Fix k £ {2, 3, ...,d}, {a, (3) £ and consider 

a polynomial /(x) = //^;^^)(x) corresponding to the second order minor of the 

matricization Xf^f. By definition of Ofc ^ Pk and there exists an index 

i £ [d] \{A:} such that ^ Pi- Assume that i > k. Define the polynomials g{x) £ 
^{i,2,...,k} |/f^Jy ’'=}(x) : (q;,/3) e and d(x) G 7?.f L 2 ,....fc-i} _ 

^/(x) ^ot^f3 3~ ^ckD) 2, 1,2,... ,fc} 

^(x) = Xfy^{l,2,...,k}X^{l,2,...,k} “t“ ./c}{l,2,...,fc-l}T^.{i 2,..,,fc}{l,2,...,fe-l} 

Since a:^{i. 2 „,.,fe}{i. 2 ,...,fe-i}a:^{i, 2 „,.,fc}{i. 2 ,....fc-i} = a:„{fe}a:^{fc}, we have /(x) = g(x) + 
h{x) and thus / G Jd,TT- If * < fc notice that /(x) = gi{x) + di(x), where 

5l(x) = —XaXp + Xfy^{l,2,...,k-l}Xp{l,2,...,k-l} £ 

kl(x) = —X^{l,2,...,k-l}X^(l,2,...,k-l} + X^fi^2,...,k-l}{l,2,...,k}X^fi^2,...,k-l}{l,2,..,k} 

= —X^{l,2,...,fc}X^{l,2.fc} -£ X^2^(k} X^(k} £ 7^f . 

□ 
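The telescoping idea behind Algorithm 3 can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: a polynomial is stored as a dictionary from monomials to integer coefficients, a monomial $x_\alpha x_\beta$ is stored as a sorted pair of index tuples (so that the order of the two variables does not matter), positions are 0-based, and the names `monomial`, `minor`, `swap_at`, `telescoping_minors` and `add` are hypothetical. Each $g_i$ exchanges the entries of the two indices at a single position $t_i$, and the sum of the $g_i$ equals the minor obtained by exchanging all positions in $\tau$ at once.

```python
from collections import Counter

def monomial(a, b):
    """Canonical representation of the degree-two monomial x_a * x_b."""
    return tuple(sorted((tuple(a), tuple(b))))

def minor(a, b, a2, b2):
    """Polynomial -x_a x_b + x_a2 x_b2 as a {monomial: coefficient} dict."""
    p = Counter()
    p[monomial(a, b)] -= 1
    p[monomial(a2, b2)] += 1
    return {m: c for m, c in p.items() if c != 0}

def swap_at(a, b, positions):
    """Exchange the entries of a and b at the given (0-based) positions."""
    a2, b2 = list(a), list(b)
    for j in positions:
        a2[j], b2[j] = b[j], a[j]
    return tuple(a2), tuple(b2)

def telescoping_minors(alpha, beta, tau):
    """Polynomials g_1, ..., g_k, each swapping a single position of tau."""
    polys = []
    a, b = tuple(alpha), tuple(beta)
    for t in sorted(tau):
        a_next, b_next = swap_at(a, b, [t])
        polys.append(minor(a, b, a_next, b_next))
        a, b = a_next, b_next
    return polys

def add(polys):
    """Sum of polynomials given as {monomial: coefficient} dicts."""
    total = Counter()
    for p in polys:
        total.update(p)  # Counter.update adds coefficients
    return {m: c for m, c in total.items() if c != 0}
```

For example, with `alpha = (1, 2, 3, 1)`, `beta = (2, 1, 1, 3)` and `tau = {0, 2}`, the two single-position polynomials add up to `minor(alpha, beta, *swap_at(alpha, beta, tau))`, because the intermediate monomial cancels.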

Remark 6. Fix a decomposition tree $T_I$ which generates a particular HT-decomposition
and consider the ideal $J_{d,\mathrm{HT},T_I}$ generated by all second order minors corresponding
to the matricizations induced by the tree $T_I$. In a similar way as above, one can
obtain that $J_{d,\mathrm{HT},T_I}$ equals $J_d$.
5. Convergence of the unit $\theta_k$-norm balls

In this section we show the following result on the convergence of the unit $\theta_k$-balls.


Theorem 5.1. The theta body sequence of $J_d$ converges asymptotically to
$\operatorname{conv}(\nu_{\mathbb{R}}(J_d))$, i.e.,

$$\bigcap_{k=1}^{\infty} \mathrm{TH}_k(J_d) = \operatorname{conv}(\nu_{\mathbb{R}}(J_d)).$$


To prove Theorem 5.1 we use the following result presented in [5] which is a
consequence of Schmüdgen's Positivstellensatz.


Theorem 5.2. Let $J$ be an ideal such that $\nu_{\mathbb{R}}(J)$ is compact. Then the theta body
sequence of $J$ converges to the convex hull of the variety $\nu_{\mathbb{R}}(J)$, in the sense that

$$\bigcap_{k=1}^{\infty} \mathrm{TH}_k(J) = \operatorname{conv}(\nu_{\mathbb{R}}(J)).$$

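Proof of Theorem 5.1. The set $\nu_{\mathbb{R}}(J_d)$ is the set of rank-one tensors with unit Frobenius
norm which can be written as $\nu_{\mathbb{R}}(J_d) = \mathcal{A}_1 \cap \mathcal{A}_2$ where

$$\mathcal{A}_1 = \{ \mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d} : \operatorname{rank}(\mathbf{X}) = 1 \} \quad \text{and} \quad \mathcal{A}_2 = \{ \mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d} : \|\mathbf{X}\|_F = 1 \}.$$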

It is well-known that Ai is closed m discussion before Definition 2.2] and since A 2 
is clearly compact, $\nu_{\mathbb{R}}(J_d)$ is compact. Therefore, the result follows from Theorem 5.2. □


6. Computational Complexity 


The computational complexity of the semidefinite programs for computing the
$\theta_1$-norm of a tensor or for minimizing the $\theta_1$-norm subject to a linear constraint
depends polynomially on the number of variables, i.e., on the size of $\mathcal{B}_{2k}$, and on the
dimension of the moment matrix $\mathbf{M}$. We claim that the overall complexity scales
polynomially in $n$, where for simplicity we consider $d$th-order tensors in $\mathbb{R}^{n \times n \times \cdots \times n}$.
Therefore, in contrast to tensor nuclear norm minimization which is NP-hard for
$d \ge 3$, tensor recovery via $\theta_1$-norm minimization is tractable.

Indeed, the moment matrix $\mathbf{M}$ is of dimension $(1 + n^d) \times (1 + n^d)$ (see also (15)
for the matrix case) and if $a = n^d$ denotes the total number of entries of a tensor
$\mathbf{X} \in \mathbb{R}^{n \times n \times \cdots \times n}$, then the number of the variables is at most $O(a^2)$, which
is polynomial in $a$. (A more precise counting does not give a substantially better
estimate.)


7. Numerical experiments 

Let us now empirically study the performance of low rank tensor recovery via
$\theta_1$-norm minimization in numerical experiments, where we concentrate on third-order
tensors. Given measurements $\mathbf{b} = \Phi(\mathbf{X})$ of a low rank tensor $\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$,
where $\Phi : \mathbb{R}^{n_1 \times n_2 \times n_3} \to \mathbb{R}^m$ is a linear measurement map, we aim at reconstructing
$\mathbf{X}$ as the solution of the minimization program

$$\min \|\mathbf{Z}\|_{\theta_1} \quad \text{subject to } \Phi(\mathbf{Z}) = \mathbf{b}. \tag{22}$$

As outlined above, the $\theta_1$-norm of a tensor $\mathbf{Z}$ can be computed as the optimal value
of the semidefinite program

$$\min t \quad \text{subject to } \mathbf{M}(t, \mathbf{y}, \mathbf{Z}) \succeq 0,$$







Table 3. The matrices involved in the definition of the moment
matrix $\mathbf{M}(t, \mathbf{y}, \mathbf{X})$. Due to the symmetry only the upper triangle
part of the matrices is specified. The other non-specified entries
of the matrices $\mathbf{M} \in \mathbb{R}^{(n_1 n_2 n_3 + 1) \times (n_1 n_2 n_3 + 1)}$ in the first column
are equal to zero. The matrix $\mathbf{M}$ corresponds to the element
$g + J_3$ of the $\theta$-basis specified in the second column. The index
$I = (i, \hat{i}, j, \hat{j}, k, \hat{k})$ is in the range of the last column. The function
$f : \mathbb{N}^3 \to \mathbb{N}$ is defined as $f(i, j, k) = (i - 1) n_2 n_3 + (j - 1) n_3 + k + 1$.



0-basis 

position (p, q) in the matrix 

Mpq 

Range of 7,7, j, j, fc, fc 

Mo 

1 

(1,1), (2, 2) 

1 


Mjjfe 

^ijk 

(l,/(bA k)) 

1 

i G [ni] ,j G [772] ,fc e [773] 

M^2 

„ 2 . 

(2,2) 

-1 




(/(*,j, k),f{i,j, k)) 

1 

{7 e [771], j e [772] ,fc G [773]} 





\{i=j = k = l} 

M}, 

^ijk^ijk 

(/(*,J, k),f{i,j,k)), 

1 




ifihj, k)J{i,j,k)) 

1 

i G [ni] ,j <j,k <k 

Ml 


(/(*,j, k)Jii,j,k)) 

1 




(/(*,], ^),/(bj,fc)) 

1 




{f{'i,j,k),f{i,j,k)), 

1 




(/(*,j, k),f{i,j,k)) 

1 

i < < j,k < k 



ifihJ, k)J{i,j,k)), 

1 




ifihj, k)Jii,j, k)) 

1 

i <i,j G [772] ,k < k 

M/. 


ifihj, k)Jii,j,k)) 

1 




{f{hj,k),f{i,j,k)) 

1 

i < i,j < j,k G [773] 

M', 


(/(*,j, k)Jii,j, k)) 

1 

i <i,j G [772] ,k G [773] 

M?. 


ifihj, k),f{i,j,k)) 

1 

i G [ni] ,j <j,kG [773] 



{fihj, k)J{i,j, k)) 

1 

i G [771] , j G [772] ,k < k 


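where $\mathbf{M}(t, \mathbf{y}, \mathbf{X}) = \mathbf{M}_{\mathcal{B}_1}(t, \mathbf{X}, \mathbf{y})$ is the moment matrix of order 1 associated to the
ideal $J_3$, see Theorem 4.2. This moment matrix for $J_3$ is explicitly given by

$$\mathbf{M}(t, \mathbf{y}, \mathbf{X}) = t \mathbf{M}_0 + \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \sum_{k=1}^{n_3} X_{ijk} \mathbf{M}_{ijk} + \sum_{p=2}^{9} \sum_{q=1}^{|\mathbf{M}^p|} y_\ell \, \mathbf{M}^p_{h_p(q)},$$

where $\ell = \sum_{r=2}^{p-1} |\mathbf{M}^r| + q$, and the matrices $\mathbf{M}_0$, $\mathbf{M}_{ijk}$ and $\mathbf{M}^p_I$ are
provided in Table 3. For $p \in \{2, 3, \ldots, 9\}$, the function $h_p$ denotes an arbitrary
but fixed bijection $\{1, 2, \ldots, |\mathbf{M}^p|\} \to \{(i, \hat{i}, j, \hat{j}, k, \hat{k})\}$, where $I = (i, \hat{i}, j, \hat{j}, k, \hat{k})$ is
in the range of the last column of Table 3. As discussed for the general
case, the $\theta_1$-norm minimization problem (22) is then equivalent to the semidefinite
program

$$\min_{t, \mathbf{y}, \mathbf{Z}} t \quad \text{subject to } \mathbf{M}(t, \mathbf{y}, \mathbf{Z}) \succeq 0 \text{ and } \Phi(\mathbf{Z}) = \mathbf{b}. \tag{23}$$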
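The index map $f$ from the caption of Table 3 determines where the entry $x_{ijk}$ sits in the moment matrix. The short Python sketch below only illustrates that formula; the helper `positions` and the explicit passing of $n_2, n_3$ as arguments are choices made here for illustration. Position 1 is reserved for the constant monomial, so the tensor entries occupy positions $2, \ldots, n_1 n_2 n_3 + 1$.

```python
from itertools import product

def f(i, j, k, n2, n3):
    """Position of x_{ijk} in the moment matrix (all indices are 1-based)."""
    return (i - 1) * n2 * n3 + (j - 1) * n3 + k + 1

def positions(n1, n2, n3):
    """All positions f(i, j, k) for i in [n1], j in [n2], k in [n3]."""
    return [f(i, j, k, n2, n3)
            for i, j, k in product(range(1, n1 + 1),
                                   range(1, n2 + 1),
                                   range(1, n3 + 1))]
```

Since every position in the range is hit exactly once, the moment matrix has $n_1 n_2 n_3 + 1$ rows and columns, in agreement with the size stated in the caption.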

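For our experiments, the linear mapping is defined as $(\Phi(\mathbf{X}))_k = \langle \mathbf{X}, \Phi_k \rangle$,
$k \in [m]$, with independent Gaussian random tensors $\Phi_k \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, i.e., all
entries of $\Phi_k$ are independent mean-zero Gaussian random variables. We choose tensors
$\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ of rank one as $\mathbf{X} = \mathbf{u} \otimes \mathbf{v} \otimes \mathbf{w}$, where each entry of the vectors $\mathbf{u}$,
$\mathbf{v}$, and $\mathbf{w}$ is taken independently from the normal distribution $\mathcal{N}(0, 1)$. Tensors
$\mathbf{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ of rank two are generated as the sum of two random rank-one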
tensors. With $\Phi$ and $\mathbf{X}$ given, we compute $\mathbf{b} = \Phi(\mathbf{X})$, run the semidefinite program
(23) and compare its minimizer with the original low rank tensor X. For a given 
set of parameters, i.e., dimensions $n_1, n_2, n_3$, number of measurements $m$ and
rank $r$, we repeat this experiment 200 times and record the empirical success rate
of recovering the original tensor, where we say that recovery is successful if the
elementwise reconstruction error is at most $10^{-6}$. We use MATLAB (R2008b) for
these numerical experiments, including SeDuMi_1.3 for solving the semidefinite 
programs. 
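The data generation for one trial can be sketched in plain Python. This is a hedged illustration and not the MATLAB code used for the experiments: tensors are stored as dictionaries keyed by $(i, j, k)$, the function names are hypothetical, the variance $1/m$ of the measurement entries is an assumption made here, and the "elementwise reconstruction error" is interpreted as the largest absolute entrywise difference. The semidefinite program itself is not solved in this sketch.

```python
import random

def random_rank_one(n1, n2, n3, rng):
    """X = u (x) v (x) w with independent N(0, 1) entries in u, v, w."""
    u = [rng.gauss(0.0, 1.0) for _ in range(n1)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n2)]
    w = [rng.gauss(0.0, 1.0) for _ in range(n3)]
    return {(i, j, k): u[i] * v[j] * w[k]
            for i in range(n1) for j in range(n2) for k in range(n3)}

def add_tensors(x, y):
    """Entrywise sum, e.g. of two rank-one tensors to get rank at most two."""
    return {idx: x[idx] + y[idx] for idx in x}

def gaussian_measurement_tensors(m, shape, rng, variance=None):
    """m independent Gaussian tensors; variance 1/m is an ASSUMPTION."""
    if variance is None:
        variance = 1.0 / m
    std = variance ** 0.5
    n1, n2, n3 = shape
    return [{(i, j, k): rng.gauss(0.0, std)
             for i in range(n1) for j in range(n2) for k in range(n3)}
            for _ in range(m)]

def measure(phis, x):
    """b_k = <X, Phi_k> for every measurement tensor Phi_k."""
    return [sum(phi[idx] * x[idx] for idx in x) for phi in phis]

def is_recovered(x_hat, x, tol):
    """Success criterion: largest entrywise deviation is at most tol."""
    return max(abs(x_hat[idx] - x[idx]) for idx in x) <= tol
```

A trial would draw `X`, compute `b = measure(phis, X)`, pass `phis` and `b` to a semidefinite programming solver for (23), and finally call `is_recovered` on the solver output.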

Table 4 summarizes the results of our numerical tests for cubic and non-cubic
tensors of rank one and two and several choices of the dimensions. Here, the
number $m_0$ denotes the maximal number of measurements for which not even one
out of 200 generated tensors is recovered and $m_1$ denotes the minimal number of
measurements for which all 200 tensors are recovered. The fifth column in Table 4
represents the number of independent measurements which are always sufficient
for the recovery of a tensor of an arbitrary rank. For illustration, we present the
average CPU time (in seconds) for solving the semidefinite programs via SeDuMi_1.3
in the last column. Alternatively, the SDPNAL+ Matlab toolbox (version 0.5 beta)
for semidefinite programming [611163j allows to perform low rank tensor recovery 
via $\theta_1$-norm minimization for even higher-dimensional tensors. For example, with
$m = 95$ measurements we managed to recover all rank-one $9 \times 9 \times 9$ tensors out of
200 (each simulation taking about 5 min). Similarly, rank-one $11 \times 11 \times 11$ tensors
are recovered from $m = 125$ measurements with one simulation lasting about 50 min.
Due to these large computation times, more elaborate numerical experiments have 
not been conducted in these scenarios. We remark that no attempt of accelerating 
the optimization algorithm has been made. This task is left for future research. 


Table 4. Numerical results for low rank tensor recovery in $\mathbb{R}^{n_1 \times n_2 \times n_3}$.

n1 x n2 x n3   rank   m0   m1   n1 n2 n3   cpu (sec)
2x2x3          1      4    12   12         0.2
3x3x3          1      6    19   27         0.37
3x4x5          1      11   30   60         6.66
4x4x4          1      11   32   64         7.28
4x5x6          1      18   42   120        129.48
5x5x5          1      18   43   125        138.90
3x4x5          2      27   56   60         7.55
4x4x4          2      26   56   64         8.65
4x5x6          2      41   85   120        192.58


Except for very small tensor dimensions, we can always recover tensors of rank
one or two from a number of measurements which is significantly smaller than the
dimension of the corresponding tensor space. Therefore, low rank tensor recovery
via $\theta_1$-minimization seems to be a promising approach. Of course, it remains to
investigate the recovery performance theoretically.

Figures 2 and 3 present the numerical results for low rank tensor recovery via
$\theta_1$-norm minimization for Gaussian measurement maps, conducted with the SDPNAL+
toolbox. For fixed tensor dimensions $n \times n \times n$, fixed tensor rank $r$, and fixed
number $m$ of measurements, 50 simulations are performed. We say that recovery
is successful if the element-wise reconstruction error is smaller than 10“^. Figures 
1a, 2a, 3a and 1b, 2b, 3b present experiments for rank-one and rank-two tensors,
respectively. The vertical axis in all three figures represents the empirical success
rate. In Figure 1 the horizontal axis represents the relative number of measurements.
Figure 1. Recovery of rank-1 and rank-2 tensors via $\theta_1$-norm minimization. Panels: (a) recovery of rank-1 tensors; (b) recovery of rank-2 tensors.


Figure 2. Recovery of rank-1 and rank-2 tensors via $\theta_1$-norm minimization. Panels: (a) recovery of rank-1 tensors; (b) recovery of rank-2 tensors. Horizontal axis: $m/(3nr)$.


to be more precise, for a tensor of size nxnx ru the number n on the horizontal axis 
represents m = measurements. In Figure 2 for a rank-r tensor of size nxnx n 
and the number of measurements $m$, the horizontal axis represents the number
$m/(3nr)$. Notice that $3nr$ represents the degrees of freedom in the corresponding
CP-decomposition. In particular, if the number of measurements necessary for
tensor recovery is $m \ge 3Crn$, for a universal constant $C$, Figure 2 suggests that the
constant $C$ depends on the size of the tensor. Specifically, it seems to grow slightly
with $n$ (although it is still possible that there exists $C > 0$ such that $m \ge 3Crn$
would always be enough for the recovery). With $C = 3.3$ we would always be able
to recover a low rank tensor of size $n \times n \times n$ with $n \le 7$. The horizontal axis in
Figure 3 represents the number $m/(3nr \cdot \log(n))$. The figure suggests that with the
number of measurements $m \ge 6rn \cdot \log(n)$ we would always be able to recover a low
rank tensor and therefore it may be possible that a logarithmic factor is necessary.
The computation is implemented in MATLAB R2016a, on an Acer laptop with a
1.90 GHz CPU and 4 GB of RAM.
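The two normalizations used on the horizontal axes can be evaluated for the measurement counts quoted earlier. The sketch below is only a rough illustration: it takes $m_1$ from Table 4 for the cubic rank-one cases and the two SDPNAL+ runs ($n = 9$, $m = 95$ and $n = 11$, $m = 125$), which were obtained with different solvers and numbers of trials than the figures, and it assumes that $\log$ denotes the natural logarithm. The names `CASES`, `per_dof` and `per_dof_log` are hypothetical.

```python
import math

# (n, r, m): cubic rank-one cases; m is taken from Table 4 (column m1)
# and from the two SDPNAL+ runs mentioned in the text.
CASES = [(3, 1, 19), (4, 1, 32), (5, 1, 43), (9, 1, 95), (11, 1, 125)]

def per_dof(n, r, m):
    """Measurements per degree of freedom, m / (3 n r)."""
    return m / (3 * n * r)

def per_dof_log(n, r, m):
    """m / (3 n r log n); the natural logarithm is an ASSUMPTION."""
    return m / (3 * n * r * math.log(n))

for n, r, m in CASES:
    print(n, round(per_dof(n, r, m), 2), round(per_dof_log(n, r, m), 2))
```

Under these assumptions the ratio $m/(3nr)$ increases with $n$ over the listed cases, while $m/(3nr \log n)$ stays below 2, which is in line with the two observations made above.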

We remark that we have used standard MATLAB packages for convex optimization
to perform the numerical experiments. To obtain better performance, new
optimization methods should be developed specifically to solve our optimization
problem, or more generally, to solve the sum-of-squares polynomial problems. We

Figure 3. Recovery of rank-1 and rank-2 tensors via $\theta_1$-norm minimization. Panels: (a) recovery of rank-1 tensors; (b) recovery of rank-2 tensors. Horizontal axis: $m/(3nr \cdot \log(n))$.


expect this to be possible and the resulting algorithms to give much better performance
results since we have shown that in the matrix scenario all theta norms
correspond to the matrix nuclear norm. The state-of-the-art algorithms developed
for the matrix scenario can compute the matrix nuclear norm and can solve the
matrix nuclear norm minimization problem for matrices of large dimensions. The
theory developed in this paper together with the first numerical results should
encourage further development in this direction.

Appendix A. Monomial orderings and Gröbner bases

An ordering on the set of monomials $x^\alpha \in \mathbb{R}[x]$, $x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$, is
essential for dealing with polynomial ideals. For instance, it determines an order in
a multivariate polynomial division algorithm. Of particular interest is the graded
reverse lexicographic (grevlex) ordering.

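Definition A.1. For $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, $\beta = (\beta_1, \beta_2, \ldots, \beta_n) \in \mathbb{Z}^n_{\ge 0}$, we write
$x^\alpha >_{\mathrm{grevlex}} x^\beta$ (or $\alpha >_{\mathrm{grevlex}} \beta$) if $|\alpha| > |\beta|$ or $|\alpha| = |\beta|$ and the rightmost nonzero
entry of $\alpha - \beta$ is negative.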
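Definition A.1 translates directly into a comparison of exponent vectors. The following Python function is a minimal sketch of that comparison; the name `grevlex_greater` is hypothetical and exponent vectors are assumed to be tuples of nonnegative integers of equal length.

```python
def grevlex_greater(alpha, beta):
    """Return True if x^alpha > x^beta in the grevlex ordering."""
    if sum(alpha) != sum(beta):
        # The total degree decides first.
        return sum(alpha) > sum(beta)
    # Equal total degree: inspect the rightmost nonzero entry of alpha - beta.
    for a, b in zip(reversed(alpha), reversed(beta)):
        if a != b:
            return a < b  # the entry a - b is negative
    return False  # alpha == beta
```

For instance, `grevlex_greater((1, 5, 2), (4, 1, 3))` holds because both exponents have total degree 8 and the rightmost nonzero entry of the difference $(-3, 4, -1)$ is negative.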

Once a monomial ordering is fixed, the meaning of leading monomial, leading 
term and leading coefficient of a polynomial (see Section is well-defined. For 
more information on monomial orderings, we refer the interested reader to [TillTS] . 

A Gröbner basis is a particular kind of generating set of a polynomial ideal. It
was first introduced in 1965 in the PhD thesis of Buchberger [5].

Definition A.2 (Gröbner basis). For a fixed monomial order, a basis $\mathcal{G} = \{g_1, \ldots, g_s\}$
of a polynomial ideal $J \subset \mathbb{R}[x]$ is a Gröbner basis (or standard basis) if for all
$f \in \mathbb{R}[x]$ there exist a unique $r \in \mathbb{R}[x]$ and $g \in J$ such that

$$f = g + r$$

and no monomial of $r$ is divisible by any of the leading monomials in $\mathcal{G}$, i.e., by any
of the monomials $\mathrm{LM}(g_1), \mathrm{LM}(g_2), \ldots, \mathrm{LM}(g_s)$.

A Gröbner basis is not unique, but the reduced version defined next is.

Definition A.3. The reduced Gröbner basis for a polynomial ideal $J \subset \mathbb{R}[x]$ is a
Gröbner basis $\mathcal{G} = \{g_1, g_2, \ldots, g_s\}$ for $J$ such that

1) $\mathrm{LC}(g_i) = 1$, for all $i \in [s]$.

2) $g_i$ does not belong to $\langle \mathrm{LT}(\mathcal{G} \setminus \{g_i\}) \rangle$ for all $i \in [s]$.

In other words, a Gröbner basis $\mathcal{G}$ is the reduced Gröbner basis if for all $i \in [s]$
the polynomial $g_i \in \mathcal{G}$ is monic (i.e., $\mathrm{LC}(g_i) = 1$) and the leading monomial $\mathrm{LM}(g_i)$
does not divide any monomial of $g_j$, $j \neq i$.

Many important properties of the ideal and the corresponding algebraic variety
can be deduced via its (reduced) Gröbner basis. For example, a polynomial belongs
to a given ideal if and only if the unique $r$ from Definition A.2 equals zero.
Gröbner bases are also one of the main computational tools in solving systems of
polynomial equations [15].

With $\overline{f}^{F}$ we denote the remainder on division of $f$ by the ordered $k$-tuple
$F = (f_1, f_2, \ldots, f_k)$. If $F$ is a Gröbner basis for an ideal $\langle f_1, f_2, \ldots, f_k \rangle$, then we
can regard $F$ as a set without any particular order by Definition A.2, or in other
words, the result of the division algorithm does not depend on the order of the
polynomials. Therefore, $\overline{f}^{\mathcal{G}} = r$ in Definition A.2.

The following result follows directly from Definition A.2 and the polynomial
division algorithm.


Corollary A.4. Fix a monomial ordering and let $\mathcal{G} = \{g_1, g_2, \ldots, g_s\} \subset \mathbb{R}[x]$ be a
Gröbner basis of a polynomial ideal $J$. A polynomial $f \in \mathbb{R}[x]$ is in the ideal $J$ if it
can be written in the form $f = a_1 g_1 + a_2 g_2 + \ldots + a_s g_s$, where $a_i \in \mathbb{R}[x]$, for all
$i \in [s]$, s.t. whenever $a_i g_i \neq 0$ we have

$$\operatorname{multideg}(f) \ge \operatorname{multideg}(a_i g_i).$$


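Definition A.5. Fix a monomial order and let $\mathcal{G} = \{g_1, g_2, \ldots, g_s\} \subset \mathbb{R}[x]$. Given
$f \in \mathbb{R}[x]$, we say that $f$ reduces to zero modulo $\mathcal{G}$ and write

$$f \to_{\mathcal{G}} 0$$

if it can be written in the form $f = a_1 g_1 + a_2 g_2 + \ldots + a_s g_s$ with $a_i \in \mathbb{R}[x]$ for all
$i \in [s]$ s.t. whenever $a_i g_i \neq 0$ we have $\operatorname{multideg}(f) \ge \operatorname{multideg}(a_i g_i)$.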


Assume that $\mathcal{G}$ in the above definition is a Gröbner basis of a given ideal $J$. Then
a polynomial $f$ is in the ideal $J$ if and only if $f$ reduces to zero modulo $\mathcal{G}$. In other
words, for a Gröbner basis $\mathcal{G}$,

$$f \to_{\mathcal{G}} 0 \quad \text{if and only if} \quad \overline{f}^{\mathcal{G}} = 0.$$

The Grobner basis of a polynomial ideal always exists and can be computed in a 
finite number of steps via Buchberger’s algorithm ladiKii]. 

Next we define the S-polynomial of given polynomials $f$ and $g$ which is important
for checking whether a given basis of the ideal is a Gröbner basis.


Definition A.6. Let $f, g \in \mathbb{R}[x]$ be non-zero polynomials.

(1) If $\operatorname{multideg}(f) = \alpha$ and $\operatorname{multideg}(g) = \beta$, then let $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_n)$,
where $\gamma_i = \max\{\alpha_i, \beta_i\}$, for every $i$. We call $x^\gamma$ the least common multiple
of $\mathrm{LM}(f)$ and $\mathrm{LM}(g)$, written $x^\gamma = \mathrm{LCM}(\mathrm{LM}(f), \mathrm{LM}(g))$.

(2) The S-polynomial of $f$ and $g$ is the combination

$$S(f, g) = \frac{x^\gamma}{\mathrm{LT}(f)} f - \frac{x^\gamma}{\mathrm{LT}(g)} g.$$

The following theorem gives a criterion for checking whether a given basis of a
polynomial ideal is a Gröbner basis.


Theorem A.7 (Buchberger's criterion). A basis $\mathcal{G} = \{g_1, g_2, \ldots, g_s\}$ for a polynomial
ideal $J \subset \mathbb{R}[x]$ is a Gröbner basis if and only if $S(g_i, g_j) \to_{\mathcal{G}} 0$ for all
$i \neq j$.

Computing whether $S(g_i, g_j) \to_{\mathcal{G}} 0$ for all possible pairs of polynomials in the
basis $\mathcal{G}$ can be a tedious task. The following proposition tells us for which pairs of
polynomials this is not needed.

Proposition A.8. Given a finite set $\mathcal{G} \subset \mathbb{R}[x]$, suppose that the leading monomials
of $f, g \in \mathcal{G}$ are relatively prime, i.e.,

$$\mathrm{LCM}(\mathrm{LM}(f), \mathrm{LM}(g)) = \mathrm{LM}(f) \, \mathrm{LM}(g),$$

then $S(f, g) \to_{\mathcal{G}} 0$.

Therefore, to prove that the set $\mathcal{G} \subset \mathbb{R}[x]$ is a Gröbner basis, it is enough to show
that $S(g_i, g_j) \to_{\mathcal{G}} 0$ for those $i < j$ where $\mathrm{LM}(g_i)$ and $\mathrm{LM}(g_j)$ are not relatively
prime.
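The S-polynomial of Definition A.6 and the relative primality test of Proposition A.8 can be sketched in a few lines of Python. The representation is an assumption made for illustration: a polynomial is a dictionary mapping exponent tuples to coefficients, the grevlex ordering of Definition A.1 is used to find leading terms, and the names `grevlex_key`, `leading`, `s_polynomial` and `relatively_prime` are hypothetical.

```python
from fractions import Fraction

def grevlex_key(alpha):
    """Sort key: a larger key means a larger monomial in grevlex."""
    return (sum(alpha), tuple(-a for a in reversed(alpha)))

def leading(poly):
    """Leading monomial and leading coefficient of a nonzero polynomial."""
    m = max(poly, key=grevlex_key)
    return m, poly[m]

def s_polynomial(f, g):
    """S(f, g) = (x^gamma / LT(f)) * f - (x^gamma / LT(g)) * g."""
    (mf, cf), (mg, cg) = leading(f), leading(g)
    gamma = tuple(max(a, b) for a, b in zip(mf, mg))
    result = {}
    for poly, m_lead, c_lead, sign in ((f, mf, cf, 1), (g, mg, cg, -1)):
        shift = tuple(c - a for c, a in zip(gamma, m_lead))
        for m, c in poly.items():
            key = tuple(x + s for x, s in zip(m, shift))
            result[key] = result.get(key, 0) + sign * Fraction(c) / c_lead
    return {m: c for m, c in result.items() if c != 0}

def relatively_prime(mf, mg):
    """True if LCM(x^mf, x^mg) equals the product x^mf * x^mg."""
    return all(min(a, b) == 0 for a, b in zip(mf, mg))
```

As a usage example in two variables, for $f = x^3 y^2 - x^2 y^3 + x$ and $g = 3 x^4 y + y^2$ the sketch returns $S(f, g) = -x^3 y^3 + x^2 - \tfrac{1}{3} y^3$. The leading monomials $x^3 y^2$ and $x^4 y$ are not relatively prime, so Proposition A.8 does not apply to this pair.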